INTRODUCTION


In many practical situations, the experimenter (or the decision-maker)
is faced with the problem of comparing $k$ ($\geq 2$) populations,
where each population is characterized by a real-valued parameter $\theta$.
In such situations, the classical approach is to test the hypothesis
of homogeneity (equality) among the $k$ parameters. On the other hand,
the real interest (or goal) of the experimenter may be to identify
the best population (defined by the experimenter in terms of, say,
a large value of $\theta$), or to find a subset which contains the best
population, or a subset which contains all populations better than
a control or standard. Thus, the test of homogeneity is inadequate
in  several  aspects.  Mosteller  (1948),  Paulson  (1949),  Bahadur 
(1950)  and  Bahadur  and  Robbins  (1950)  were  among  the  earliest 
research  workers  to  recognize  this  inadequacy.  Since  these  early 
studies,  the  area  of  selection  and  ranking  problems  has  been  very 
active.  It  has  seen  tremendous  growth  over  the  last  three  and  a  half 
decades. 


There have been mainly two formulations in selection and
ranking problems, namely, the "indifference zone" approach and the
"subset selection" approach. In the first formulation, due to
Bechhofer (1954), the goal is to select one population (or a
fixed number $t$, $1 < t < k$) as the best population with a preassigned
minimum probability $P^*$ whenever the unknown parameters lie outside
some subspace of the parameter space, the so-called indifference zone.
Important contributions using this approach have been made by
Bechhofer and Sobel (1954), Bechhofer, Dunnett and Sobel (1954),
Sobel (1967), Mahamunulu (1967), Paulson (1967), Bechhofer, Kiefer
and Sobel (1968), Desu and Sobel (1968, 1971), Dudewicz and Dalal
(1975), Tamhane and Bechhofer (1977, 1979), among others.

In the second formulation, pioneered by Gupta (1956, 1965),
the goal is to select a nonempty, nontrivial subset of the $k$ populations
so that the best population is included in the selected subset with
a minimum guaranteed probability $P^*$ ($1/k < P^* < 1$) over the whole
parameter space. The size of the selected subset is not determined
in advance but is made to depend on the outcome of the experiment.
Some recent contributions in this formulation have been made by Gupta
and Studden (1970), Gupta and Nagel (1971), Gupta and Panchapakesan
(1972), Santner (1975), Gupta and Huang (1975a, 1975b), Gupta and
Huang (1976), Bickel and Yahav (1977), Gupta and Hsiao (1983),
Gupta and Huang (1980), and Lorenzen and McDonald (1981).
Contributions to the nonparametric subset selection procedures have been
made by Rizvi and Sobel (1967), Barlow and Gupta (1969), Nagel
(1970), Gupta and McDonald (1970), Randles (1970), Ghosh (1973),
Hsu (1978, 1981), and Huang and Panchapakesan (1982).

Recently some contributions to the selection and ranking
procedures based on isotonic estimators have been made by Gupta and
Yang (1984), Gupta and Huang (1983), Gupta and Leu (1983b), and
Huang (1984).

There have also been some contributions to the selection and
ranking procedures in two stages. These are relevant when, for
example, the experimenter wants to select a subset of populations
(under investigation) which contains the populations of interest so
that the populations in the selected subset can be examined further.
Some important contributions in this direction have been made by Santner
(1976), Mukhopadhyay (1980), and Gupta and Kim (1984) under the
classical setting, and by Miescke (1980, 1983), Gupta and Miescke (1982),
and Gupta and Miescke (1984) under the Bayesian setting.

For further developments in both formulations, reference can
be made to Gupta and Panchapakesan (1979) (see also Gibbons, Olkin
and Sobel (1977), Gupta and Huang (1981), and Dudewicz and Koo
(1982)).

The main contribution of this thesis is to propose and
study new subset selection procedures for some important and practical
problems for the generalized family of lambda distributions. It should
be pointed out that the family of Tukey's generalized lambda
distributions is very broad and contains most well-known distributions
as special cases.

Chapter  I  deals  with  selection  and  ranking  procedures  based  on 
sample  medians  for  the  symmetric  lambda  distributions  and  applications 
of  the  lambda  family  of  distributions.  We  investigate  some  properties 
of  the  lambda  family  of  distributions.  We  also  propose  some  selection 
procedures  and  study  the  properties  of  these  procedures  such  as 
asymptotic relative efficiencies. An application of the lambda
distribution for approximating some constants used in the selection
and  ranking  procedures  for  other  symmetric  theoretical  distributions 
is  made.  Tables  of  associated  constants  for  the  proposed  procedures 
are  given  in  this  chapter. 

Chapter  II  deals  with  the  problem  of  isotonic  selection 
procedures  for  the  family  of  lambda  distributions  and  for  logistic 
distributions.  We  propose  and  study  some  isotonic  procedures 
for  symmetric  lambda  distributions  and  for  logistic  distributions.  In 
particular, we investigate the approximations of constants used in
the  proposed  procedures.  It  is  shown  that  the  isotonic  procedure  is 
better  than  some  classical  procedures  in  terms  of  reducing  the 
expected  number  of  bad  populations  in  the  selected  subset.  Tables 
of  associated  constants  for  the  proposed  procedures  are  given  in 
this  chapter. 

Chapter  III  deals  with  the  problem  of  choosing  the  optimal 
score  function  for  different  nonparametric  procedures  proposed  by 
Nagel (1970) and Gupta and McDonald (1970). Tukey's lambda
family of distributions is considered as the distribution for the
score function. A Monte Carlo study for the optimal choice of the
score function is carried out. This study indicates that the score
function based on a uniform distribution is optimal and robust against
possible deviations from the underlying distributions. Tables
containing the values of score functions and the results of the simulations
are  given  in  this  chapter. 

Chapter IV deals with the problem of an elimination-type
two-stage selection procedure under the Bayesian setting. We propose a
two-stage procedure $R(\alpha,d)$ which retains good populations at the
first stage, and selects the best among the retained populations. At
Stage 2 we use a stopping rule to construct a $100(1-2\alpha)\%$ Highest
Posterior Density (HPD) credible region with a common width $2d$ for
the unknown means of the selected populations. We study the properties
of the rule $R(\alpha,d)$. Several figures are drawn to examine the
performance of the procedure $R(\alpha,d)$. These figures are based on the
results of a Monte Carlo study.


CHAPTER  I 


SELECTION  AND  RANKING  PROCEDURES  FOR  TUKEY'S 
GENERALIZED  LAMBDA  DISTRIBUTIONS 

1.1  Introduction 

Tukey's generalized lambda distribution (hereafter called the
lambda distribution) was suggested by Tukey (1960) as a wide
class of symmetric distributions and is defined in terms of its
inverse cumulative distribution function. It has been generalized
by Ramberg and Schmeiser (1972, 1974) so as to include both
symmetric and asymmetric distributions. Originally, Ramberg and
Schmeiser (1972, 1974) generalized and used the lambda distribution
for the purpose of generating continuous unimodal symmetric and
asymmetric random variates, since it is well known that the lambda
distribution can be used to approximate many continuous theoretical
distributions and empirical distributions. Therefore, since the
work of Ramberg and Schmeiser (1972, 1974), the lambda distribution
has also been used for Monte Carlo studies. Moberg, Ramberg and
Randles (1978) have used the lambda distribution for Monte Carlo
studies to check the robustness of the adaptive M-estimator for the
selection problem under the indifference zone formulation.
Also Ramberg, Tadikamalla, Dudewicz and Mykytka (1979) have used the
lambda distribution to fit a distribution to a given set of data.


They also provided a useful table for various values of parameters of
the lambda distribution for given combinations of skewness and
kurtosis. Hogg, Fisher and Randles (1972) have studied the
(empirical) power of the adaptive distribution-free test by using the
lambda distribution for various combinations of skewness and
kurtosis. Filliben (1969) has used the lambda distribution for
estimating the location parameters of symmetric distributions.
Joiner and Rosenblatt (1971) have studied the problem of the
distribution of ranges of samples from the lambda distribution. Mykytka
and Ramberg (1979) and Öztürk and Dale (1985) have studied the
problem of estimating the parameters of the lambda distribution
with a given data set.

If  we  confine  ourselves  to  the  class  of  unimodal  continuous 
univariate  distributions,  skewness  and  kurtosis  can  be  used  as  good 
measures  to  characterize  a  distribution.  The  lambda  distribution  is 
defined  by  values  of  its  parameters  which  are  determined  by  its  first 
four central moments. The lambda distribution covers both symmetric
and asymmetric distributions. The family of Burr distributions
(1942, 1973) is also a general system of distributions, which is
defined by two constants which determine the corresponding skewness,
kurtosis, mean and variance. The Burr family, however, is much more
difficult to handle than the lambda distribution family because the
values of the two constants of the Burr distribution do not provide a
clear interpretation of its skewness and kurtosis. On the other hand,
the lambda distribution is clearly defined by the location, scale and
shape parameters, which are directly related to the skewness and
kurtosis. The Pearson and Johnson systems (see Hahn and Shapiro
(1967)), again, require several different functions to cover the
classes of symmetric and asymmetric distributions. On the other hand,
the lambda distribution family is defined by only one function and
still it covers both symmetric and asymmetric distributions. Thus the
family of lambda distributions is simple, flexible, and easy to use, as
well as quite broad and general. Hence the use of the lambda
distribution as a model for selection and ranking problems provides
results applicable, at least approximately, to several parametric
distributions. Also, by changing the values of the parameters,
we can examine the performance of the selection procedures by taking into
consideration the given data. For example, if based on a given
sample, one believes that the underlying distribution is a
heavy-tailed distribution, somewhere between the logistic and double
exponential, then for this case one can assume the lambda distribution with
several sets of values of parameters which are determined by the
kurtosis, which, in this case, varies between 4.2 and 6.0. Again one
can examine the robustness of any selection procedure with respect to
several assumptions on the underlying distribution.

Recently several computer packages in the field of selection
and ranking have been developed by several authors. For example,
the package RS-MCB was developed by Gupta and Hsu (1984a, 1984b) and
Edwards (1984a, 1984b) has developed the package RANKSEL. These
packages mainly deal with normal models. It is, however,
possible to modify them to cover more models because
the precision of the approximation in using the lambda distribution is
very good. We will discuss this further in Sections 2 and 4 of
Chapter I.

It is well known that for a symmetric distribution the sample
median is an unbiased estimate of the location parameter and is robust
in the presence of contamination from heavy-tailed distributions.
Hence selection procedures based on the sample medians, under the
formulation of the subset selection approach, have been developed for
several distributions. Gupta and Leong (1979) have considered a
procedure for selecting the largest of location parameters for the
case of double exponential or Laplace distributions. Gupta and Singh
(1980) have studied the case of normal distributions and Lorenzen
and McDonald (1981) have considered the case of logistic
distributions.

Here we consider some selection procedures based on sample
medians for selecting the population associated with the largest
location parameter among $k$ populations whose observable characteristics
follow lambda distributions.

In Section 1.2, we define the lambda distribution and also discuss
some properties including tail-ordering.

In Section 1.3, the problem of selecting the population associated
with the largest location parameter is studied for both the subset
selection approach and the indifference zone approach for the
symmetric lambda distribution. Some new selection procedures are proposed.
The properties of these procedures such as asymptotic relative
efficiencies (ARE) are studied. Also tables of constants necessary to
carry out the procedures along with ARE's of the proposed selection
procedures are computed and tabulated. Comparisons of the rules based
on medians with the selection rules based on sample means are provided
for the case of symmetric lambda distributions with different values
of parameters.

In Section 1.4, an application of the lambda distribution for
approximating some constants used in the selection and ranking
problems for other symmetric theoretical distributions is studied.
Comparisons between exact values and approximated values are made for the
case of logistic distributions.

As  a  closing  remark,  since  the  lambda  distribution  can  be  used 
to  approximate  theoretical  continuous  distributions,  one  can  get  many 
(approximate)  results  including  evaluations  of  constants  used  in  the 
various  parametric  situations  for  selection  and  ranking  problems  by 
using  a  lambda  distribution  by  choosing  values  of  its  parameters 
properly. 

At the end of this chapter, Table 1.1 is provided for values of
the  scale  and  shape  parameters  for  symmetric  distributions  for 
various  values  of  the  kurtosis  ranging  from  1.8  to  9.0  with  steps 
of  0.1.  This  table  gives  8  significant  digits  and  this  is  an 
improvement  over  the  table  of  Ramberg,  Tadikamalla,  Dudewicz  and 
Mykytka  (1979)  in  terms  of  both  its  scope  and  precision  for  the 
symmetric  case. 

1.2  Definition  and  Properties  of  the  Lambda  Distribution 
Definition 1.2.1. Let $\theta, \beta, \gamma_1, \gamma_2 \in \mathbb{R}^1$, where $\beta\gamma_1 > 0$, $\beta\gamma_2 > 0$
and $\gamma_1 \cdot \gamma_2 > 0$. Let $F(\cdot)$ denote the cumulative distribution function
(cdf) of a distribution and let $F^{-1}(\cdot)$ be its inverse. Then for
$0 < p < 1$ and $x \in \mathbb{R}^1$, the lambda distribution $F(x)$ is defined by its
inverse cdf as

$$x = F^{-1}(p) = \theta + \frac{1}{\beta}\left(p^{\gamma_1} - (1-p)^{\gamma_2}\right), \tag{1.2.1}$$

where $\theta$ and $\beta$ are location and scale parameters, respectively, and
$\gamma_1$ and $\gamma_2$ are shape parameters.
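Because the lambda distribution is defined through its inverse cdf, random variates can be generated directly by the inverse transform method. The following is a minimal sketch under that reading of (1.2.1); the function names `lambda_quantile` and `lambda_sample` are hypothetical and are introduced here only for illustration.

```python
import random


def lambda_quantile(p, theta=0.0, beta=1.0, gamma1=1.0, gamma2=1.0):
    """Inverse cdf (1.2.1): x = theta + (p**gamma1 - (1 - p)**gamma2) / beta."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return theta + (p ** gamma1 - (1.0 - p) ** gamma2) / beta


def lambda_sample(n, theta=0.0, beta=1.0, gamma1=1.0, gamma2=1.0, rng=None):
    """Draw n variates by applying the inverse cdf to uniform(0, 1) numbers."""
    rng = rng or random.Random()
    values = []
    while len(values) < n:
        u = rng.random()
        if u > 0.0:  # random() may return exactly 0.0, which is excluded
            values.append(lambda_quantile(u, theta, beta, gamma1, gamma2))
    return values
```

For instance, with `beta = 1` and `gamma1 = gamma2 = 1` the quantile function reduces to `2p - 1`, so the generated values are uniform on (-1, 1).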

If $\gamma_1 = \gamma_2$, the lambda distribution is symmetric. The moments
and the support of the distribution depend upon $\beta$, $\gamma_1$ and $\gamma_2$. For
example, for $\beta > 0$, $\gamma_1 > 0$ and $\gamma_2 > 0$, it has positive moments of
all orders and its support is the interval $(\theta - 1/\beta, \theta + 1/\beta)$. On the
other hand, for $\gamma_1 < -1$, $\gamma_2 > 1$ and $\gamma_1 > 1$, $\gamma_2 < -1$, there exist no
positive moments. Ramberg, Tadikamalla, Dudewicz and Mykytka (1979)
have studied these properties in detail and have provided some figures
which characterize well-known continuous distributions by their
standardized third and fourth moments. Here we assume that the signs of both
scale and shape parameters are the same for the symmetric case.

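The mean, the variance, and the third and fourth central moments
of the lambda distribution are given by

$$\mu_1 = \theta + \left[\frac{1}{\gamma_1+1} - \frac{1}{\gamma_2+1}\right]\Big/\beta, \tag{1.2.2}$$

$$\mu_2 = \left\{\left[\frac{1}{2\gamma_1+1} - 2\,\mathrm{Be}(\gamma_1+1,\gamma_2+1) + \frac{1}{2\gamma_2+1}\right] - \left[\frac{1}{\gamma_1+1} - \frac{1}{\gamma_2+1}\right]^2\right\}\Big/\beta^2, \tag{1.2.3}$$

$$\begin{aligned}\mu_3 = \Big\{&\left[\frac{1}{3\gamma_1+1} - 3\,\mathrm{Be}(2\gamma_1+1,\gamma_2+1) + 3\,\mathrm{Be}(\gamma_1+1,2\gamma_2+1) - \frac{1}{3\gamma_2+1}\right] \\ &- 3\left[\frac{1}{2\gamma_1+1} - 2\,\mathrm{Be}(\gamma_1+1,\gamma_2+1) + \frac{1}{2\gamma_2+1}\right]\left[\frac{1}{\gamma_1+1} - \frac{1}{\gamma_2+1}\right] \\ &+ 2\left[\frac{1}{\gamma_1+1} - \frac{1}{\gamma_2+1}\right]^3\Big\}\Big/\beta^3,\end{aligned} \tag{1.2.4}$$

and

$$\begin{aligned}\mu_4 = \Big\{&\left[\frac{1}{4\gamma_1+1} - 4\,\mathrm{Be}(3\gamma_1+1,\gamma_2+1) + 6\,\mathrm{Be}(2\gamma_1+1,2\gamma_2+1) - 4\,\mathrm{Be}(\gamma_1+1,3\gamma_2+1) + \frac{1}{4\gamma_2+1}\right] \\ &- 4\left[\frac{1}{3\gamma_1+1} - 3\,\mathrm{Be}(2\gamma_1+1,\gamma_2+1) + 3\,\mathrm{Be}(\gamma_1+1,2\gamma_2+1) - \frac{1}{3\gamma_2+1}\right]\left[\frac{1}{\gamma_1+1} - \frac{1}{\gamma_2+1}\right] \\ &+ 6\left[\frac{1}{2\gamma_1+1} - 2\,\mathrm{Be}(\gamma_1+1,\gamma_2+1) + \frac{1}{2\gamma_2+1}\right]\left[\frac{1}{\gamma_1+1} - \frac{1}{\gamma_2+1}\right]^2 \\ &- 3\left[\frac{1}{\gamma_1+1} - \frac{1}{\gamma_2+1}\right]^4\Big\}\Big/\beta^4,\end{aligned} \tag{1.2.5}$$

respectively, where $\mathrm{Be}(a,b)$ is the beta function with parameters $a$ and
$b$. For the symmetric case, i.e., $\gamma_1 = \gamma_2 = \gamma$, these can be simplified
as

$$\mu_1 = \theta, \tag{1.2.6}$$

$$\mu_2 = 2\left[\frac{1}{2\gamma+1} - \mathrm{Be}(\gamma+1,\gamma+1)\right]\Big/\beta^2, \tag{1.2.7}$$

$$\mu_3 = 0, \tag{1.2.8}$$

$$\mu_4 = 2\left[\frac{1}{4\gamma+1} - 4\,\mathrm{Be}(3\gamma+1,\gamma+1) + 3\,\mathrm{Be}(2\gamma+1,2\gamma+1)\right]\Big/\beta^4. \tag{1.2.9}$$

Hence the standardized fourth moment, called kurtosis or a measure of
peakedness, denoted by $\alpha_4$, is

$$\alpha_4 = \frac{\mu_4}{\mu_2^2} = \frac{1}{2}\cdot\frac{1/(4\gamma+1) - 4\,\mathrm{Be}(3\gamma+1,\gamma+1) + 3\,\mathrm{Be}(2\gamma+1,2\gamma+1)}{\left[1/(2\gamma+1) - \mathrm{Be}(\gamma+1,\gamma+1)\right]^2}. \tag{1.2.10}$$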
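The symmetric-case formulas (1.2.7), (1.2.9) and (1.2.10) are straightforward to evaluate numerically. The sketch below is only an illustration: the helper names are hypothetical, the beta function is computed from the gamma function in the standard library, and the shape parameter is assumed to satisfy $\gamma > -1/4$ and $\gamma \neq 0$ so that the fourth moment exists and the denominator of (1.2.10) is nonzero.

```python
from math import gamma as gamma_fn


def beta_fn(a, b):
    """Beta function Be(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)."""
    return gamma_fn(a) * gamma_fn(b) / gamma_fn(a + b)


def symmetric_variance(beta, g):
    """Variance (1.2.7) of the symmetric lambda distribution."""
    return 2.0 * (1.0 / (2 * g + 1) - beta_fn(g + 1, g + 1)) / beta ** 2


def symmetric_fourth_moment(beta, g):
    """Fourth central moment (1.2.9) of the symmetric lambda distribution."""
    bracket = (1.0 / (4 * g + 1)
               - 4.0 * beta_fn(3 * g + 1, g + 1)
               + 3.0 * beta_fn(2 * g + 1, 2 * g + 1))
    return 2.0 * bracket / beta ** 4


def symmetric_kurtosis(g):
    """Kurtosis (1.2.10); it does not depend on the scale parameter."""
    numerator = (1.0 / (4 * g + 1)
                 - 4.0 * beta_fn(3 * g + 1, g + 1)
                 + 3.0 * beta_fn(2 * g + 1, 2 * g + 1))
    denominator = 2.0 * (1.0 / (2 * g + 1) - beta_fn(g + 1, g + 1)) ** 2
    return numerator / denominator
```

As a sanity check, $\gamma = 1$ and $\gamma = 2$ both give a uniform distribution, and the kurtosis formula returns 1.8 in both cases.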


Now we discuss some other properties of the family of lambda
distributions. For this, we first discuss tail-ordering of
distributions. The definition of a tail-ordering due to Doksum (1969) is as
follows:

Definition 1.2.2. Let $G$ and $H$ be continuous distributions of random
variables $X$ and $Y$, respectively. Then $G$ is said to be tail-ordered
with respect to $H$, denoted by $G <_t H$, if and only if $G(0) = H(0) = 1/2$
and $H^{-1}[G(x)] - x$ is non-decreasing on the support of $G$.


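For symmetric continuous lambda distributions the
following theorem holds.

Theorem 1.2.1. Let $F$ and $G$ be symmetric lambda
distributions with location parameters $\theta_1 = \theta_2 = 0$, scale parameters
$\beta_1$ and $\beta_2$, and shape parameters $\gamma_1$ and $\gamma_2$, respectively, where
$\gamma_1 > \gamma_2$. If $\beta_1/\gamma_1 > \beta_2/\gamma_2$, then

$$F <_t G.$$

Proof. Let $\Delta(x) = G^{-1}[F(x)] - x$. Then

$$\Delta(x) = \frac{1}{\beta_2}\left[F(x)^{\gamma_2} - (1-F(x))^{\gamma_2}\right] - x.$$

Thus

$$\Delta'(x) = \frac{d\Delta(x)}{dx} = \frac{\gamma_2}{\beta_2}\left[F(x)^{\gamma_2-1} + (1-F(x))^{\gamma_2-1}\right]\frac{dF(x)}{dx} - 1.$$

Transforming $z = F(x)$, we have

$$\frac{dF(x)}{dx} = \frac{\beta_1}{\gamma_1\left(z^{\gamma_1-1} + (1-z)^{\gamma_1-1}\right)},$$

and thus, since $\gamma_1 > \gamma_2$, if $\beta_1/\gamma_1 > \beta_2/\gamma_2$,

$$\Delta'(x) > \frac{z^{\gamma_2-1} + (1-z)^{\gamma_2-1}}{z^{\gamma_1-1} + (1-z)^{\gamma_1-1}} - 1 = \frac{z^{\gamma_2-1}\left(1 - z^{\gamma_1-\gamma_2}\right) + (1-z)^{\gamma_2-1}\left[1 - (1-z)^{\gamma_1-\gamma_2}\right]}{z^{\gamma_1-1} + (1-z)^{\gamma_1-1}} > 0.$$

This completes the proof.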
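The monotonicity argument in the proof can be checked numerically. The sketch below evaluates the derivative of $G^{-1}[F(x)] - x$ as a function of $z = F(x)$ on a grid in (0, 1); the function names are hypothetical, and the check is an illustration under the assumption that scale and shape parameters share the same sign, not a substitute for the proof.

```python
def delta_prime(z, beta1, gamma1, beta2, gamma2):
    """Derivative of G^{-1}[F(x)] - x, expressed in z = F(x), for 0 < z < 1.

    F has parameters (beta1, gamma1) and G has parameters (beta2, gamma2);
    both are symmetric lambda distributions with location 0.
    """
    ratio = ((z ** (gamma2 - 1) + (1 - z) ** (gamma2 - 1))
             / (z ** (gamma1 - 1) + (1 - z) ** (gamma1 - 1)))
    return (gamma2 * beta1) / (beta2 * gamma1) * ratio - 1.0


def is_tail_ordered_on_grid(beta1, gamma1, beta2, gamma2, steps=999):
    """Return True if delta_prime is nonnegative at every interior grid point."""
    grid = (i / (steps + 1) for i in range(1, steps + 1))
    return all(delta_prime(z, beta1, gamma1, beta2, gamma2) >= 0.0 for z in grid)
```

For example, with $(\beta_1, \gamma_1) = (1, 0.5)$ and $(\beta_2, \gamma_2) = (0.2, 0.2)$ the conditions of Theorem 1.2.1 hold and the grid check succeeds, while exchanging the roles of the two distributions makes it fail.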
Ramberg and Schmeiser (1974) have derived the $k$th moment, denoted
by $\mu'_k$, of the lambda distribution with $\theta = 0$, $\beta$, $\gamma_1$ and $\gamma_2$ as follows:
when $\mu'_k$ exists,

$$\mu'_k = \beta^{-k}\sum_{i=0}^{k}\binom{k}{i}(-1)^i\,\mathrm{Be}\left(\gamma_1(k-i)+1,\ \gamma_2 i+1\right). \tag{1.2.11}$$

Here, by using the method of moment generating functions, the first 4
moments of the sample mean based on $n$ independent observations from
a lambda distribution with $\theta = 0$, $\beta$, $\gamma_1$ and $\gamma_2$, where $\beta$, $\gamma_1$ and $\gamma_2$ are
chosen so that the moments exist, are given by the following theorem.


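Theorem 1.2.2. Let $\bar{X}_n$ denote the sample mean based on $n$ independent
observations from a lambda distribution with location parameter
$\theta = 0$, scale parameter $\beta$ and shape parameters $\gamma_1$ and $\gamma_2$, and let
$\bar{\mu}'_i$ denote the $i$th moment of $\bar{X}_n$. If values
of $\beta$, $\gamma_1$ and $\gamma_2$ are such that $\bar{\mu}'_1$, $\bar{\mu}'_2$, $\bar{\mu}'_3$ and $\bar{\mu}'_4$ exist, then they are
given by

$$\bar{\mu}'_1 = \frac{\mathrm{SUM}(1)}{\beta}, \tag{1.2.12}$$

$$\bar{\mu}'_2 = \frac{\mathrm{SUM}(2)}{n\beta^2} + \frac{n-1}{n\beta^2}\,\mathrm{SUM}^2(1), \tag{1.2.13}$$

$$\bar{\mu}'_3 = \frac{\mathrm{SUM}(3)}{n^2\beta^3} + \frac{3(n-1)}{n^2\beta^3}\,\mathrm{SUM}(2)\,\mathrm{SUM}(1) + \frac{(n-1)(n-2)}{n^2\beta^3}\,\mathrm{SUM}^3(1), \tag{1.2.14}$$

$$\begin{aligned}\bar{\mu}'_4 = \frac{1}{n^3\beta^4}\Big\{&\mathrm{SUM}(4) + 4(n-1)\,\mathrm{SUM}(3)\,\mathrm{SUM}(1) + 3(n-1)\,\mathrm{SUM}^2(2) \\ &+ 6(n-1)(n-2)\,\mathrm{SUM}(2)\,\mathrm{SUM}^2(1) + (n-1)(n-2)(n-3)\,\mathrm{SUM}^4(1)\Big\},\end{aligned} \tag{1.2.15}$$

where

$$\mathrm{SUM}(i) = \sum_{j=0}^{i}\binom{i}{j}(-1)^j\,\mathrm{Be}\left(\gamma_1(i-j)+1,\ \gamma_2 j+1\right).$$

Proof. From the fact that

$$\varphi_{\bar{X}_n}(t) = \left[\varphi_X(t/n)\right]^n$$

and

$$\varphi_X(t) = \sum_{i=0}^{\infty}\frac{1}{i!}\left(\frac{t}{\beta}\right)^i\mathrm{SUM}(i),$$

one can get the results by using standard methods, where $\varphi_X(t)$ is the
moment generating function of a random variable $X$ which has a lambda
distribution with parameters $\theta = 0$, $\beta$, $\gamma_1$ and $\gamma_2$.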

For a symmetric lambda distribution, i.e., $\gamma_1 = \gamma_2 = \gamma$, the
following corollary holds.

Corollary 1.2.3. Under the same assumption as in Theorem 1.2.2
and letting $\gamma_1 = \gamma_2 = \gamma$, the following equations hold.

$$\bar{\mu}'_1 = 0, \tag{1.2.16}$$

$$\bar{\mu}'_2 = \frac{\mathrm{SUM}(2)}{n\beta^2}, \tag{1.2.17}$$

$$\bar{\mu}'_3 = 0, \tag{1.2.18}$$

$$\bar{\mu}'_4 = \frac{1}{n^3\beta^4}\left\{\mathrm{SUM}(4) + 3(n-1)\,\mathrm{SUM}^2(2)\right\}, \tag{1.2.19}$$

and

$$\frac{\bar{\mu}'_4}{(\bar{\mu}'_2)^2} = \frac{\mathrm{SUM}(4) + 3(n-1)\,\mathrm{SUM}^2(2)}{n\,\mathrm{SUM}^2(2)}. \tag{1.2.20}$$

Proof. Since $\mathrm{SUM}(i) = 0$ for all odd $i$ when $\gamma_1 = \gamma_2 = \gamma$, one can get
the results from Theorem 1.2.2 and hence the proof is omitted.
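The quantity SUM(i) and the kurtosis of the sample mean in (1.2.20) can be evaluated as in the following sketch. The names `sum_term` and `sample_mean_kurtosis` are hypothetical, and the shape parameter is assumed to lie in the range where the fourth moment exists.

```python
from math import comb, gamma as gamma_fn


def beta_fn(a, b):
    """Beta function Be(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)."""
    return gamma_fn(a) * gamma_fn(b) / gamma_fn(a + b)


def sum_term(i, g1, g2):
    """SUM(i) = sum_j C(i, j) (-1)^j Be(g1 (i - j) + 1, g2 j + 1)."""
    return sum(comb(i, j) * (-1) ** j * beta_fn(g1 * (i - j) + 1, g2 * j + 1)
               for j in range(i + 1))


def sample_mean_kurtosis(n, g):
    """Kurtosis (1.2.20) of the mean of n observations, symmetric case."""
    s2 = sum_term(2, g, g)
    s4 = sum_term(4, g, g)
    return (s4 + 3 * (n - 1) * s2 ** 2) / (n * s2 ** 2)
```

With $n = 1$ the function returns the kurtosis of a single observation, and as $n$ grows the value approaches 3.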

For  a  symmetric  lambda  distribution,  the  following  remarks  can 
be  made. 

Remarks : 

(1) From Corollary 1.2.3, one can see that the limiting distribution
of $\bar{X}_n$ has kurtosis 3, which is the same value as that of a normal
distribution.

(2) Corollary 1.2.3 can be utilized to approximate the
distribution of the sample mean of some symmetric continuous distributions
which are not infinitely divisible. Goel (1974) has derived the
distribution of the sample mean from a logistic population as a series by
using the method of characteristic functions and has provided tables of the
cdf for $n = 2(1)12$ at points 0.00(0.01)3.99 and $n = 13(1)15$ at points
1.2(0.01)3.89. Using the result of Corollary 1.2.3, the cdf of the
logistic sample mean was approximated. It was seen that the maximum
difference was less than 0.00155 for all values of $n$. This maximum
error occurs at the point $x = 0.6$ for all the values of $n$. For
$x \geq 1.0$, the error decreases as $x$ increases and for $x \in [1.2, 3.9]$ the
maximum error is less than 0.0007 for all $n$. The above discussion
shows that the distribution of the sample mean of a logistic population
can be approximated very well by using the lambda distribution.

1.3  Selecting  the  Population  with  the  Largest  Location  Parameter 
Based  on  Sample  Medians 


1.3.1. The Proposed Rule $R_T$ for Subset Selection - Symmetric Case

Let $\pi_1, \pi_2, \ldots, \pi_k$ be $k$ ($\geq 2$) independent populations which are
characterized by observable random variables $X_1, X_2, \ldots, X_k$, respectively.
Let $X_i$ follow a symmetric lambda distribution with an unknown location
parameter $\theta_i$, and common known second and fourth central moments $\mu_2$
and $\mu_4$, $i = 1, 2, \ldots, k$, respectively. This implies that the random
variables $X_i$'s have common known scale and shape parameters $\beta$ and $\gamma$,
respectively, given by equations (1.2.7) and (1.2.9). Also without loss
of generality, we may assume $\mu_2 = 1$. Let $f(\cdot|\theta_i)$ and $F(\cdot|\theta_i)$ denote
the probability density function (pdf) and cdf of a random variable $X_i$
and let $X_{ij}$, $j = 1, 2, \ldots, n$, be $n$ independent observations from $\pi_i$,
$i = 1, 2, \ldots, k$, respectively. Let $\Omega = \{\theta = (\theta_1, \ldots, \theta_k) \in \mathbb{R}^k\}$ be the
parameter space and let $\Omega_0 = \{\theta \in \Omega: \theta_1 = \cdots = \theta_k = \theta_0\}$. Let
$\theta_{[1]} \leq \theta_{[2]} \leq \cdots \leq \theta_{[k]}$ denote the ordered $\theta_i$'s. The population
associated with $\theta_{[k]}$ is called the best population. Also let $\pi_{(i)}$
denote the population corresponding to $\theta_{[i]}$. It is assumed that
no prior knowledge is available for the correct pairing between $\pi_i$
and $\pi_{(i)}$, $i = 1, 2, \ldots, k$. Our goal is to select a nontrivial (nonempty)
subset including the best population so as to satisfy the $P^*$-condition,
i.e., $\inf_{\theta \in \Omega} P_\theta(CS|R) \geq P^*$, where CS stands for a correct selection,
i.e. a selection of any subset which includes the best. For
convenience, let $n = 2m+1$, $m \geq 1$, and let $X_{i:m}$ be the sample median of $\pi_i$.
Let $X_{[1]:m} \leq X_{[2]:m} \leq \cdots \leq X_{[k]:m}$ be the ordered $X_{i:m}$'s. It is well known
that a sample median $X_{i:m}$ has a pdf and a cdf

$$g(x|\theta_i) = \frac{(2m+1)!}{(m!)^2}\,[F(x|\theta_i)]^m\,[1 - F(x|\theta_i)]^m\,f(x|\theta_i) \tag{1.3.1}$$

and

$$G(x|\theta_i) = I_{F(x|\theta_i)}(m+1, m+1), \tag{1.3.2}$$

respectively, where $I_x(a,b)$ is an incomplete beta function with
parameters $a$ and $b$. Let $X_{(i):m}$ be the sample median corresponding to
$\theta_{[i]}$.

Now we propose the following selection rule $R_T$:

$R_T$: Select $\pi_i$ if and only if $X_{i:m} \geq X_{[k]:m} - d_0$,

where $d_0$ ($\geq 0$) is chosen so as to satisfy the $P^*$-condition. Without
loss of generality, we can assume that $\theta_0 = 0$ in $\Omega_0$. Under this
assumption, let $G(\cdot)$ and $g(\cdot)$ denote the cdf and pdf of the sample
median, respectively. Also under this assumption, let $f(\cdot)$ and $F(\cdot)$
denote the pdf and cdf of $X_i$, respectively. Then the following
theorem holds.


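Theorem 1.3.1. For the rule $R_T$,

$$\inf_{\theta \in \Omega} P_\theta(CS|R_T) = \inf_{\theta \in \Omega_0} P_\theta(CS|R_T) = \frac{(2m+1)!}{(m!)^2}\int_{-\infty}^{\infty} I^{k-1}_{F(x+d_0)}(m+1, m+1)\,[F(x)]^m\,[1-F(x)]^m f(x)\,dx. \tag{1.3.3}$$

Proof.

$$\begin{aligned}\inf_{\theta \in \Omega} P_\theta(CS|R_T) &= \inf_{\theta \in \Omega} P_\theta(\pi_{(k)} \text{ is selected}\,|\,R_T) \\ &= \inf_{\theta \in \Omega} \Pr\{X_{(k):m} \geq X_{(j):m} - d_0,\ j = 1, \ldots, k-1\} \\ &= \inf_{\theta \in \Omega} \int_{-\infty}^{\infty} \prod_{j=1}^{k-1} G(x + \theta_{[k]} - \theta_{[j]} + d_0)\,g(x)\,dx \\ &= \int_{-\infty}^{\infty} G^{k-1}(x + d_0)\,g(x)\,dx \\ &= \frac{(2m+1)!}{(m!)^2}\int_{-\infty}^{\infty} I^{k-1}_{F(x+d_0)}(m+1, m+1)\,[F(x)]^m\,[1-F(x)]^m f(x)\,dx.\end{aligned}$$

Hence the proof is complete.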
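Applied to data, the rule $R_T$ defined above amounts to comparing each sample median with the largest sample median minus $d_0$. A minimal sketch follows; the function name `select_subset` is hypothetical, the populations are identified by their position in the input list, and the constant `d0` is assumed to have been determined beforehand from the $P^*$-condition.

```python
from statistics import median


def select_subset(samples, d0):
    """Apply rule R_T: keep population i iff its median >= max median - d0.

    samples is a list of k samples, each with an odd number n = 2m + 1 of
    observations; the return value lists the indices of selected populations.
    """
    if d0 < 0:
        raise ValueError("d0 must be nonnegative")
    medians = [median(sample) for sample in samples]
    cutoff = max(medians) - d0
    return [i for i, value in enumerate(medians) if value >= cutoff]
```

Because the population with the largest sample median always satisfies the inequality, the selected subset is never empty, and its size grows with `d0`.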


Values of $d_0 = d_0(k, m, P^*)$ can be obtained for various values of
$k$, $m$ and $P^*$ by solving for the smallest value of $d_0$ satisfying the
following equation

$$\frac{(2m+1)!}{(m!)^2}\int_{-\infty}^{\infty} I^{k-1}_{F(x+d_0)}(m+1, m+1)\,[F(x)]^m\,[1-F(x)]^m f(x)\,dx = P^* \tag{1.3.4}$$

or

$$\frac{(2m+1)!}{(m!)^2}\int_{0}^{1} I^{k-1}_{F\left[\frac{1}{\beta}\left(t^{\gamma} - (1-t)^{\gamma}\right) + d_0\right]}(m+1, m+1)\,[t(1-t)]^m\,dt = P^*. \tag{1.3.5}$$
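Equation (1.3.5) can be solved numerically for $d_0$. The sketch below is one possible approach and not necessarily the method used to produce the tables in this chapter: the cdf $F$ is obtained by bisection on the quantile function, the incomplete beta function with integer arguments is written as a binomial tail sum, the integral is approximated by the midpoint rule, and $d_0$ is found by bisection. All function names are hypothetical, and $\beta$ and $\gamma$ are assumed to have the same sign.

```python
from math import comb, factorial


def lambda_quantile(p, beta, g):
    """Inverse cdf (1.2.1) of the symmetric lambda distribution, theta = 0."""
    return (p ** g - (1.0 - p) ** g) / beta


def lambda_cdf(x, beta, g, iterations=50):
    """cdf F(x), found by bisection on the increasing quantile function."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if lambda_quantile(mid, beta, g) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def median_cdf(u, m):
    """I_u(m + 1, m + 1), written as a binomial tail sum with n = 2m + 1."""
    n = 2 * m + 1
    return sum(comb(n, j) * u ** j * (1.0 - u) ** (n - j)
               for j in range(m + 1, n + 1))


def prob_correct_selection(d0, k, m, beta, g, steps=400):
    """Left-hand side of (1.3.5), approximated by the midpoint rule."""
    const = factorial(2 * m + 1) / factorial(m) ** 2
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) / steps
        u = lambda_cdf(lambda_quantile(t, beta, g) + d0, beta, g)
        total += median_cdf(u, m) ** (k - 1) * (t * (1.0 - t)) ** m
    return const * total / steps


def solve_d0(p_star, k, m, beta, g, tol=1e-4):
    """Smallest d0 (up to tol) whose approximate infimum of P(CS) >= p_star."""
    lo, hi = 0.0, 1.0
    while prob_correct_selection(hi, k, m, beta, g) < p_star and hi < 1e6:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if prob_correct_selection(mid, k, m, beta, g) >= p_star:
            hi = mid
        else:
            lo = mid
    return hi
```

At $d_0 = 0$ the integral reduces to $\int G^{k-1}\,dG = 1/k$, which gives a convenient check of the numerical integration before solving for $d_0$.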


Using (1.3.5), values of $d_0$ were computed. These are given in
Table 1.2 for $m = 1(1)5$, $k = 2, 3(2)9, 10, 11$, $P^* = 0.90, 0.95$ and for
specified values of kurtosis $(\mu_4/\mu_2^2) = 4.6$, 5.0, 5.6 and 7.0 with


1.3.2. Properties and Performance of the Proposed Procedure $R_T$

Now we give some well-known definitions: Let $p_i$ denote the
probability that $\pi_{(i)}$ is selected by a selection rule $R$.

Definition 1.3.1.

(a) The rule $R$ is strongly monotone in $\pi_{(i)}$ if $p_i$ is nondecreasing
in $\theta_{[i]}$ when all other components but $\theta_{[i]}$ are kept fixed and $p_i$ is
nonincreasing in $\theta_{[j]}$ for each $j \neq i$ when all other components are
kept fixed.

(b) For $\theta \in \Omega$, $R$ is said to be monotone if $p_i \leq p_j$ for $1 \leq i < j \leq k$.

(c) For $\theta \in \Omega$ and $1 \leq i \leq k$, $R$ is said to be unbiased if $p_i \leq p_k$.
Note that strong monotonicity for all $i$ $\Rightarrow$ monotonicity $\Rightarrow$ unbiasedness.

(d) Let $\psi_i(y_1, y_2, \ldots, y_k)$ be the probability that $\pi_i$ is selected by
using any selection rule $R$ based on statistics $y_1, y_2, \ldots, y_k$. Then $R$
is said to be invariant (symmetric) if

$$\psi_i(y_1, \ldots, y_i, \ldots, y_j, \ldots, y_k) = \psi_j(y_1, \ldots, y_j, \ldots, y_i, \ldots, y_k).$$


Now  we  have  the  following  theorem. 


Theorem 1.3.2.

(a) The proposed selection procedure $R_T$ is strongly monotone in
$\pi_{(i)}$, for all $i = 1, 2, \ldots, k$.

(b) The rule $R_T$ is monotone and unbiased.

(c) The procedure $R_T$ is invariant.

Proof. (a) The result follows from the fact that

$$p_i = \Pr\{X_{(i):m} \geq X_{(j):m} - d_0,\ j = 1, \ldots, k,\ j \neq i\} = \int_{-\infty}^{\infty} \prod_{j=1,\, j \neq i}^{k} G(x + \theta_{[i]} - \theta_{[j]} + d_0)\,dG(x). \tag{1.3.6}$$

Also the proofs of (b) and (c) follow from (1.3.6). Thus the
proof is complete.

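The expected size of the selected subset for the rule $R_T$,
$E_\theta(S|R_T)$, is given by

$$E_\theta(S|R_T) = \sum_{i=1}^{k} \Pr\{\pi_{(i)} \text{ is selected}\} = \sum_{i=1}^{k}\int_{-\infty}^{\infty} \prod_{j=1,\, j \neq i}^{k} G(x + d_0 + \theta_{[i]} - \theta_{[j]})\,dG(x). \tag{1.3.7}$$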


Hence, by using the same argument as in Gupta (1965), one can prove
the following theorem.

Theorem 1.3.3. For given $k$ and $P^*$ ($1/k < P^* < 1$),

$$\sup_{\theta \in \Omega} E_\theta(S|R_T) = \sup_{\theta \in \Omega_0} E_\theta(S|R_T) = k\int_{-\infty}^{\infty} G^{k-1}(x + d_0)\,dG(x) = kP^*. \tag{1.3.8}$$

Note that both $\inf_{\theta \in \Omega} P_\theta(CS|R_T)$ and $\sup_{\theta \in \Omega} E_\theta(S|R_T)$ do not depend on the
common value $\theta_0$ in $\Omega_0$. From (c) of Theorem 1.3.2 and Theorem 1.3.3, the
following theorem holds.

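Theorem 1.3.4. The procedure $R_T$ is minimax among all invariant
rules satisfying the $P^*$-condition.

Proof. For $\theta_0 \in \Omega_0$,

$$\inf_{\theta \in \Omega} P_\theta(CS|R_T) = \inf_{\theta \in \Omega_0} P_\theta(CS|R_T) = P_{\theta_0}(CS|R_T) = P^* \tag{1.3.9}$$

and

$$\sup_{\theta \in \Omega} E_\theta(S|R_T) = \sup_{\theta \in \Omega_0} E_\theta(S|R_T) = E_{\theta_0}(S|R_T) = kP^*. \tag{1.3.10}$$

Also for any invariant (symmetric) rule $R$ and $\theta_0 \in \Omega_0$,

$$E_{\theta_0}(S|R) = \sum_{i=1}^{k} \Pr\{\pi_{(i)} \text{ being selected}\,|\,R\} = \sum_{i=1}^{k}\int \cdots \int \psi_i(y_1, \ldots, y_k)\left[\prod_{j=1}^{k} g(y_j)\right]dy_1\,dy_2 \cdots dy_k = k\,P_{\theta_0}(CS|R). \tag{1.3.11}$$

Hence for $\theta_0 \in \Omega_0$,

$$E_{\theta_0}(S|R) - E_{\theta_0}(S|R_T) = k\{P_{\theta_0}(CS|R) - P_{\theta_0}(CS|R_T)\}. \tag{1.3.12}$$

Since the procedure $R$ satisfies the $P^*$-condition, from equation
(1.3.12), one can see that

$$E_{\theta_0}(S|R) \geq E_{\theta_0}(S|R_T) = \sup_{\theta \in \Omega} E_\theta(S|R_T),$$

so that

$$\sup_{\theta \in \Omega} E_\theta(S|R) \geq \sup_{\theta \in \Omega} E_\theta(S|R_T). \tag{1.3.13}$$

Hence the proof is complete.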

Now under a slippage configuration, that is, $\theta_{[1]} = \cdots = \theta_{[k-1]} =
\theta_{[k]} - \delta$, where $\delta > 0$, the asymptotic relative efficiency (ARE) of the
proposed rule $R_T$ relative to the Gupta-type procedure $R_G$, which will
be defined later, will be discussed. First, the definition of the
ARE is given as follows.

Definition 1.3.2. Under a slippage configuration with $\delta > 0$, let $S'$
be the number of non-best populations selected. Also, given $\varepsilon > 0$,
let $n_1(\varepsilon)$ and $n_2(\varepsilon)$ be minimum numbers of observations so that

$$E_\theta(S'|R_i) = \varepsilon, \quad i = 1, 2, \tag{1.3.14}$$

for procedures $R_1$ and $R_2$. Then the ARE of the rule $R_2$ relative to $R_1$
is defined by

$$\mathrm{ARE}(R_2, R_1|\delta) = \lim_{\varepsilon \to 0}\frac{n_1(\varepsilon)}{n_2(\varepsilon)}, \tag{1.3.15}$$

provided that both procedures $R_1$ and $R_2$ satisfy the $P^*$-condition. In
the sequel, without loss of generality it will be assumed that
$\theta_{[1]} = \cdots = \theta_{[k-1]} = \theta_{[k]} - \delta = 0$. Also the Gupta-type procedure $R_G$ is defined
by

$R_G$: Select $\pi_i$ if and only if $\bar{X}_i \geq \max_{1 \leq j \leq k} \bar{X}_j - d_G$,

where $\bar{X}_i$'s are sample means and $d_G$ is a nonnegative constant chosen
so as to meet the $P^*$-condition. Let $n_T$ and $n_G$ be the sample sizes for
procedures $R_T$ and $R_G$, respectively. Then as $n_T \to \infty$ and $n_G \to \infty$, one can
see that, by use of the central limit theorem,


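(1.3.16)  $\inf_{\underline{\theta}\in\Omega} P_{\underline{\theta}}(CS|R_G) \approx \int_{-\infty}^{\infty} \Phi^{k-1}(x + d_G\sqrt{n_G})\,d\Phi(x),$

(1.3.17)  $\inf_{\underline{\theta}\in\Omega} P_{\underline{\theta}}(CS|R_T) \approx \int_{-\infty}^{\infty} \Phi^{k-1}\!\left(x + \dfrac{d_0}{\sigma_T}\right) d\Phi(x),$

(1.3.18)  $E_{\underline{\theta}}(S'|R_G) \approx (k-1)\int_{-\infty}^{\infty} \Phi^{k-2}(x + d_G\sqrt{n_G})\,\Phi\big(x - (\delta - d_G)\sqrt{n_G}\big)\,d\Phi(x),$

and

(1.3.19)  $E_{\underline{\theta}}(S'|R_T) \approx (k-1)\int_{-\infty}^{\infty} \Phi^{k-2}\!\left(x + \dfrac{d_0}{\sigma_T}\right) \Phi\!\left(x - \dfrac{\delta - d_0}{\sigma_T}\right) d\Phi(x),$

where $\sigma_T^2 = 1/\big(4 n_T f^2(0)\big)$.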


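As $\epsilon \to 0$, $n_T(\epsilon)$ and $n_G(\epsilon)$ become sufficiently large and thus, from equations (1.3.16) and (1.3.17), $d_G\sqrt{n_G} \approx d_0/\sigma_T$. Also the integrals on the right-hand sides of equations (1.3.18) and (1.3.19) exist, and the integrands of both integrals are bounded and finite on $\mathbb{R}$. Thus

(1.3.20)  $E_{\underline{\theta}}(S'|R_G) - E_{\underline{\theta}}(S'|R_T) \approx (k-1)\int_{-\infty}^{\infty} \Phi^{k-2}(x + d_G\sqrt{n_G})\Big\{\Phi\big(x - (\delta - d_G)\sqrt{n_G}\big) - \Phi\big(x - (\delta - d_0)/\sigma_T\big)\Big\}\,d\Phi(x) \approx 0.$

Since $\Phi(x)$ is strictly increasing in $x$, the two arguments must agree, so that $\delta\sqrt{n_G} \approx \delta/\sigma_T$ and it can be seen that

$\dfrac{n_G(\epsilon)}{n_T(\epsilon)} \approx 4f^2(0)$ for any $\delta > 0$.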


Hence  the  following  theorem  holds. 

Theorem 1.3.5. Under the slippage configuration as defined above,

(1.3.21)  $\mathrm{ARE}(R_T, R_G \mid \delta) = 4f^2(0) = 2^{2(\gamma-1)}\left(\dfrac{\beta}{\gamma}\right)^2.$

The following table provides $\mathrm{ARE}(R_T, R_G \mid \delta)$ for various values of $\beta$ and $\gamma$ for the following values of kurtosis = 1.8, 3.0, 4.2, 5.0(1.0)9.0, with $\mu_2 = 1$.

Values of $\mathrm{ARE}(R_T, R_G \mid \delta)$

[Table entries are not recoverable from the scanned source.]


It is already known that for the slippage configuration, the AREs of the median selection rules for the normal, logistic and double exponential distributions are 0.6366, 0.8225 and 1.0000, respectively.

On the other hand, for values of kurtosis 3.0, 4.2, and 6.0 for the lambda distribution, the corresponding values of $\mathrm{ARE}(R_T, R_G \mid \delta)$ are 0.6454, 0.8235 and 0.9886, respectively. These differences are mainly due to the approximation of the corresponding distributions by lambda distributions with parameters $\beta$ and $\gamma$. Also one can see that as the tail of the distribution becomes heavier, $\mathrm{ARE}(R_T, R_G \mid \delta)$ increases, so that the rule $R_T$ becomes as efficient as the procedure $R_G$, and the rule $R_T$ is more efficient than the rule $R_G$ for very heavy-tailed distributions.
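
The ARE values quoted above can be checked numerically from the scale and shape parameters alone. The following is a minimal sketch, assuming the quantile function $F^{-1}(u) = \theta + [u^{\gamma} - (1-u)^{\gamma}]/\beta$ of the symmetric lambda distribution, so that $f(0) = \beta\,2^{\gamma-2}/\gamma$ and $4f^2(0) = 2^{2(\gamma-1)}(\beta/\gamma)^2$. The function name `are_median_vs_mean` is illustrative and does not come from the source; the parameter values are the Table 1.1 entries.

```python
def are_median_vs_mean(beta: float, gamma: float) -> float:
    """Return 4 * f(0)**2 for a unit-variance symmetric lambda distribution.

    Assumes the quantile function Q(u) = (u**gamma - (1 - u)**gamma) / beta,
    so the density at the centre is f(0) = 1 / Q'(1/2) = beta * 2**(gamma - 2) / gamma.
    """
    f0 = beta * 2.0 ** (gamma - 2.0) / gamma
    return 4.0 * f0 * f0


if __name__ == "__main__":
    # (label, kurtosis, beta, gamma) taken from Table 1.1
    cases = [
        ("normal-like", 3.0, 0.1974514, 0.1349125),
        ("logistic-like", 4.2, -0.0006589, -0.0003630),
        ("Laplace-like", 6.0, -0.1685712, -0.0801994),
    ]
    for label, kurtosis, beta, gamma in cases:
        print(f"{label:14s} kurtosis={kurtosis:.1f}  ARE={are_median_vs_mean(beta, gamma):.4f}")
```

The printed values can be compared with the 0.6454, 0.8235 and 0.9886 quoted in the text; small differences in the last digits are to be expected from rounding of the tabulated parameters.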


Remark: From Theorem 1.2.1 and Theorem 1.3.5 one can see the following: With the same condition as in Theorem 1.2.1 and under a slippage configuration, the $\mathrm{ARE}(R_T, R_G \mid \delta)$ for a distribution $F_1$ is better (larger)
than  that  of  for  a  distribution  F£  when  F^  ■<  F^ . 

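Now the performance of the rule $R_T$ will be discussed in terms of $P_{\underline{\theta}}(CS|R_T)$, $E_{\underline{\theta}}(S'|R_T)$ and $P_{\underline{\theta}}(CS|R_T)/E_{\underline{\theta}}(S'|R_T)$. Recall that for $k \ge 2$,

(1.3.22)  $P_{\underline{\theta}}(CS|R_T) = \dfrac{(2m+1)!}{(m!)^2} \int_0^1 \prod_{j=1}^{k-1} I_{F[\frac{1}{\beta}\{t^{\gamma}-(1-t)^{\gamma}\} + d_0 + \theta_{[k]} - \theta_{[j]}]}(m+1, m+1)\,[t(1-t)]^m\,dt,$

(1.3.23)  $E_{\underline{\theta}}(S|R_T) = \sum_{i=1}^{k} P_{\underline{\theta}}\{\pi_{(i)} \text{ is selected} \mid R_T\} = P_{\underline{\theta}}(CS|R_T) + E_{\underline{\theta}}(S'|R_T),$

(1.3.24)  $E_{\underline{\theta}}(S'|R_T) = \sum_{i=1}^{k-1} \dfrac{(2m+1)!}{(m!)^2} \int_0^1 \prod_{j=1,\, j \ne i}^{k} I_{F[\frac{1}{\beta}\{t^{\gamma}-(1-t)^{\gamma}\} + d_0 + \theta_{[i]} - \theta_{[j]}]}(m+1, m+1)\,[t(1-t)]^m\,dt.$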


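Here two configurations are considered, i.e., a slippage configuration $\theta_{[1]} = \cdots = \theta_{[k-1]} = \theta_{[k]} - \delta$ and an equi-spaced configuration $\theta_{[1]} = \theta_{[2]} - \delta = \cdots = \theta_{[i]} - (i-1)\delta = \cdots = \theta_{[k]} - (k-1)\delta$, where $\delta > 0$. Under a slippage configuration, equations (1.3.22) and (1.3.24) can be simplified as

$P_{\underline{\theta}}(CS|R_T) = \dfrac{(2m+1)!}{(m!)^2} \int_0^1 I^{k-1}_{F[\frac{1}{\beta}\{t^{\gamma}-(1-t)^{\gamma}\} + \delta + d_0]}(m+1, m+1)\,[t(1-t)]^m\,dt$

and

$E_{\underline{\theta}}(S'|R_T) = (k-1)\dfrac{(2m+1)!}{(m!)^2} \int_0^1 I^{k-2}_{F[\frac{1}{\beta}\{t^{\gamma}-(1-t)^{\gamma}\} + d_0]}(m+1, m+1)\; I_{F[\frac{1}{\beta}\{t^{\gamma}-(1-t)^{\gamma}\} + d_0 - \delta]}(m+1, m+1)\,[t(1-t)]^m\,dt.$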
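
The slippage-configuration integral for $P_{\underline{\theta}}(CS|R_T)$ can be evaluated with elementary numerical tools. The sketch below is illustrative only and rests on stated assumptions: the lambda cdf $F$ is obtained by bisection on the quantile function $Q(t) = [t^{\gamma} - (1-t)^{\gamma}]/\beta$, the incomplete beta ratio $I_p(m+1, m+1)$ is written as a binomial tail sum (valid for integer $m$), and the integral over $t$ uses a midpoint rule. All function names and the grid size are hypothetical choices, not part of the source.

```python
import math


def quantile(t: float, beta: float, gamma: float) -> float:
    """Quantile function of the symmetric lambda distribution with location 0."""
    return (t ** gamma - (1.0 - t) ** gamma) / beta


def cdf(x: float, beta: float, gamma: float, iters: int = 50) -> float:
    """Invert the (increasing) quantile function by bisection on (0, 1)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if quantile(mid, beta, gamma) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def median_cdf(p: float, m: int) -> float:
    """I_p(m+1, m+1): probability that at least m+1 of 2m+1 uniforms are <= p."""
    n = 2 * m + 1
    return sum(math.comb(n, j) * p ** j * (1.0 - p) ** (n - j) for j in range(m + 1, n + 1))


def pcs_slippage(k: int, m: int, d0: float, delta: float,
                 beta: float, gamma: float, grid: int = 2000) -> float:
    """Midpoint-rule evaluation of P(CS | R_T) under a slippage configuration."""
    const = math.factorial(2 * m + 1) / math.factorial(m) ** 2
    total = 0.0
    for i in range(grid):
        t = (i + 0.5) / grid
        shifted = quantile(t, beta, gamma) + delta + d0
        total += median_cdf(cdf(shifted, beta, gamma), m) ** (k - 1) * (t * (1.0 - t)) ** m
    return const * total / grid


if __name__ == "__main__":
    # (beta, gamma) for kurtosis 4.6 as listed with Table 1.2; k, m, d0, delta are example inputs.
    print(pcs_slippage(k=5, m=1, d0=1.6026, delta=0.5, beta=-0.0466, gamma=-0.0246))
```

A simple sanity check is that with $k = 2$ and $d_0 = \delta = 0$ the integral reduces to $\int G\,dG = 1/2$, and that the probability increases with $\delta$.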


Values of $P_{\underline{\theta}}(CS|R_T)$, $E_{\underline{\theta}}(S'|R_T)$, $P_{\underline{\theta}}(CS|R_T)/E_{\underline{\theta}}(S'|R_T)$ and $E_{\underline{\theta}}(S|R_T)$ under a slippage configuration are computed for $\delta$ = 0.1(0.2)0.5, 1.0, $m$ = 1(2)5, $k$ = 2, 5(2)9, $P^*$ = 0.90, 0.95 and kurtosis ($\mu_4/\mu_2^2$) = 4.6, 5.0, 5.6, 7.0 with $\mu_2 = 1$. These are given in Table 1.3. Similarly, under an equi-spaced configuration, values of $P_{\underline{\theta}}(CS|R_T)$, $E_{\underline{\theta}}(S'|R_T)$, $P_{\underline{\theta}}(CS|R_T)/E_{\underline{\theta}}(S'|R_T)$ and $E_{\underline{\theta}}(S|R_T)$ are computed. They are given in Table 1.4 for $\delta$ = 0.1(0.2)0.5, $m$ = 3, 5, $k$ = 5, 7, $P^*$ = 0.90, 0.95 and kurtosis ($\mu_4/\mu_2^2$) = 4.6, 5.6, 7.0. Note that, for $k = 2$, values of $P_{\underline{\theta}}(CS|R_T)$, $E_{\underline{\theta}}(S'|R_T)$, $P_{\underline{\theta}}(CS|R_T)/E_{\underline{\theta}}(S'|R_T)$ and $E_{\underline{\theta}}(S|R_T)$ under an equi-spaced configuration are the same as those under a slippage configuration. From Table 1.3 and Table 1.4, the following remarks can be made:

(1) As the value of kurtosis increases, values of $P_{\underline{\theta}}(CS|R_T)/E_{\underline{\theta}}(S'|R_T)$ increase and hence the proposed rule $R_T$ can be more effective for heavy-tailed populations.

(2) Values of $P_{\underline{\theta}}(CS|R_T)/E_{\underline{\theta}}(S'|R_T)$ for $P^* = 0.90$ are uniformly larger than those for $P^* = 0.95$ for all combinations of values of $k$, $m$ and $\delta$ for slippage configurations and also for equi-spaced configurations. This is probably because an increase in the value of $P^*$ causes $R_T$ to select more non-best populations relative to the improvement in $P_{\underline{\theta}}(CS|R_T)$. These tabulated values can help in an optimal choice of the value of $P^*$ in the sense of (approximately) maximizing the value of $P_{\underline{\theta}}(CS|R_T)$ and (approximately) minimizing the value of $E_{\underline{\theta}}(S'|R_T)$ simultaneously.

(3) An increase in the value of $\delta$ decreases the values of $E_{\underline{\theta}}(S'|R_T)$ more significantly than an increase in the value of $m$ for both configurations. Also values of $E_{\underline{\theta}}(S|R_T)$ decrease substantially as $\delta$ becomes larger for both configurations.

1.3.3.  Selecting  the  t-Best  Populations  with  Indifference  Zone 
Approach-Symmetric  Case 

In  Section  1.3.1  the  subset  selection  approach  for  the  selection 
of  the  population  with  the  largest  location  parameter  is  considered. 

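In this section, the indifference zone approach to select the t-best populations for the family of symmetric lambda distributions will be studied. Let the assumptions and notations be the same as those of Section 1.3.1 except for $\Omega$ and $\Omega_0$, where for $\delta^* > 0$ and $1 \le t < k$, let

$\Omega(\delta^*: t) = \{\underline{\theta} \in \mathbb{R}^k \mid \theta_{[k-t+1]} - \theta_{[k-t]} \ge \delta^*\}$

and

$\Omega_0(\delta^*: t) = \{\underline{\theta} \in \mathbb{R}^k \mid \theta_{[1]} = \cdots = \theta_{[k-t]} = \theta_{[k-t+1]} - \delta^* = \cdots = \theta_{[k]} - \delta^*\}.$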


Then our goal is to select the t-best populations associated with $\theta_{[k-t+1]}, \ldots, \theta_{[k]}$ without regard to order, and to satisfy the condition that the probability of selecting the t-best populations without regard to order is at least $P^*$ for given $\delta^*$, which is also called the $P^*$-condition, where $P^* \in \big(1/\binom{k}{t}, 1\big)$ and $\delta^*$ are specified by the experimenter. Then the selection rule $R_T(t)$ is defined as follows.

$R_T(t)$: Select the $t$ populations associated with the $t$ largest sample medians.

Then  the  following  theorem  holds. 

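Theorem 1.3.6. For $\delta^* > 0$,

(1.3.25)  $\inf_{\underline{\theta}\in\Omega(\delta^*:t)} P_{\underline{\theta}}(CS|R_T(t)) = \inf_{\underline{\theta}\in\Omega_0(\delta^*:t)} P_{\underline{\theta}}(CS|R_T(t)).$

Proof. The proof is easy and hence omitted.

From Theorem 1.3.6, the least favorable configuration is $\Omega_0(\delta^*:t)$. Also the minimum sample size $n_T$ which guarantees the $P^*$-condition is the smallest integer $n$ such that

(1.3.26)  $\inf_{\underline{\theta}\in\Omega_0(\delta^*:t)} P_{\underline{\theta}}(CS|R_T(t)) \ge P^*,$

where

(1.3.27)  $\inf_{\underline{\theta}\in\Omega_0(\delta^*:t)} P_{\underline{\theta}}(CS|R_T(t)) = t\int_{-\infty}^{\infty} G^{k-t}(x+\delta^*)\,(1-G(x))^{t-1}\,dG(x)$

$\qquad = t\,\dfrac{(2m+1)!}{(m!)^2} \int_0^1 I^{k-t}_{F[\frac{1}{\beta}(p^{\gamma}-(1-p)^{\gamma}) + \delta^*]}(m+1, m+1)\,[1 - I_p(m+1, m+1)]^{t-1}\,[p(1-p)]^m\,dp.$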


Remark. If $\mu_2$ is not assumed equal to 1, $\delta^*$ in the equation (1.3.27) should be replaced with $\delta^*/\sqrt{\mu_2}$.

Table 1.5 provides the minimum sample sizes for selected values of kurtosis = 3.0, 4.2, 5.6, 6.0, 7.0, $P^*$ = 0.90, 0.95, $k$ = 2, 3(2)7, 10, $t$ = 1(1)3 ($t < k$), and $\delta^*$ = 0.5 and 1.0 with $\mu_2 = 1$.

1.4.  Applications  of  the  Lambda  Distribution 

In  this  section,  some  applications  of  the  lambda  distribution  for 
the  evaluation  of  the  d-values  of  subset  selection  approach  in  the 
selection  and  ranking  problem  are  carried  out.  Here  we  restrict  our 
attention  to  the  symmetric  case. 

As mentioned in the introduction, the lambda distribution can approximate theoretical continuous symmetric distributions if values of location, scale and shape parameters are chosen properly. The following table shows values of scale and shape parameters $\beta$ and $\gamma$, respectively, with which the lambda distribution can be used to approximate some well-known symmetric distributions with $\mu_2 = 1$.

distribution      $\mu_4/\mu_2^2$    $\beta$                     $\gamma$
uniform           1.80               .5774                       1.0000
normal            3.00               .1975                       .1349
logistic          4.20               -.0659 x 10^-2              -.0363 x 10^-2
Laplace           6.00               -.1686                      -.0802
t with 5 df       9.00               -.3202                      -.1359
t with 10 df      4.00               .0261                       .0148
t with 34 df      3.20               .1563                       .1016
Cauchy            -                  -3.0674                     -1.0000


Remark: For the case of the Cauchy distribution, entries come from the table of Ramberg and Schmeiser (1972).

Now we consider an approximation of the values of $d_G$ of the procedure $R_G$ defined in Section 1.3.2 for the normal model. If one wants to use the selection rule $R_G$, one needs values of $d_G$, and these values are provided by many authors (for example, Gupta (1956), Gupta (1963), Gupta, Nagel and Panchapakesan (1972), among others). But by using the lambda distribution one can approximate values of $d_G$, denoted by $\hat{d}_G$, by solving the equation

(1.4.1)  $\int_{-\infty}^{\infty} F^{k-1}(x + \hat{d}_G)\,dF(x) = P^*,$

where $F(\cdot)$ is the cdf of the lambda distribution with scale parameter $\beta = 0.1975$ and shape parameter $\gamma = 0.1349$. In the following table, values of $d_G$ come from Gupta, Nagel and Panchapakesan (1972) and values of $\hat{d}_G$ are evaluated from the equation (1.4.1).


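          P* = 0.90              P* = 0.95              P* = 0.99
  k     d_G      d^_G          d_G      d^_G          d_G      d^_G
  2    1.8125   1.8126        2.3262   2.3279        3.2899   3.2...
  5    2.5997   2.6024        3.0551   3.0596        3.9196   3.9...
 10    2.9301   2.9339        3.3678   3.3728        4.1999   4.2015

(Here d^_G denotes $\hat{d}_G$; entries ending in "..." are truncated in the source.)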


From the above table, we see that the values of $\hat{d}_G$ are fairly close to those of $d_G$. These agree to at least two decimal places. Furthermore, values of $\hat{d}_G$ are conservative (larger than values of $d_G$); hence the $P^*$-condition will not be violated if one uses $\hat{d}_G$-values in place of $d_G$-values.
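
Equation (1.4.1) can be solved with nothing more than bisection. The following is a rough sketch under stated assumptions: the lambda cdf is obtained by inverting the quantile function $Q(t) = [t^{\gamma} - (1-t)^{\gamma}]/\beta$ numerically, the integral is rewritten with $t = F(x)$ as $\int_0^1 F^{k-1}(Q(t) + d)\,dt$ and approximated by a midpoint rule, and the search interval $[0, 10]$ for $d$ is an arbitrary illustrative choice. The function names are hypothetical.

```python
def quantile(t: float, beta: float, gamma: float) -> float:
    """Quantile function of the symmetric lambda distribution with location 0."""
    return (t ** gamma - (1.0 - t) ** gamma) / beta


def cdf(x: float, beta: float, gamma: float, iters: int = 50) -> float:
    """Invert the (increasing) quantile function by bisection on (0, 1)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if quantile(mid, beta, gamma) < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lhs_1_4_1(d: float, k: int, beta: float, gamma: float, grid: int = 1000) -> float:
    """Midpoint-rule value of the left side of (1.4.1), using t = F(x)."""
    total = 0.0
    for i in range(grid):
        t = (i + 0.5) / grid
        total += cdf(quantile(t, beta, gamma) + d, beta, gamma) ** (k - 1)
    return total / grid


def solve_d(k: int, p_star: float, beta: float = 0.1975, gamma: float = 0.1349,
            tol: float = 1e-5) -> float:
    """Bisection for the d that makes the left side of (1.4.1) equal to P*."""
    lo, hi = 0.0, 10.0  # illustrative bracket; the left side is increasing in d
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lhs_1_4_1(mid, k, beta, gamma) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(solve_d(k=2, p_star=0.90))
```

For $k = 2$ and $P^* = 0.90$ the exact normal-theory constant is $\sqrt{2}\,z_{0.90} \approx 1.8125$, which gives a convenient reference point for the approximation.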

Now we consider another approximation of the d-values of the subset selection procedures based on sample medians for the logistic distribution and compare those values with values from the tables of Lorenzen and McDonald (1981). We know that a logistic distribution can be approximated by a lambda distribution with a scale parameter $\beta = -0.0659 \times 10^{-2}$ and a shape parameter $\gamma = -0.0363 \times 10^{-2}$. In the following table, values of $d_t$ come from the table of Lorenzen and McDonald (1981) and values of $d_a$ are based on the approximation by using the lambda distribution.


   d_t      d_a          d_t      d_a
  0.879    0.879        1.137    1.137
  1.274    1.273        1.510    1.510
  1.377    1.376        1.609    1.609

  0.599    0.598        0.771    [illegible]
  0.863    0.863        1.019    1.018
  0.931    0.930        1.033    1.083

  0.514    0.513        0.661    0.661
  0.740    0.739        0.872    0.872
  0.797    0.797        0.927    0.926

  [illegible]  0.457    0.588    [illegible]
  0.657    0.657        0.775    0.774
  0.708    0.708        0.823    0.882

(Row and column labels of this table are not legible in the source.)


From  the  above  table,  we  can  see  that  the  approximation  by  using 

the  lambda  distribution  works  fairly  well.  The  values  agree  with  each 
other  at  least  to  two  decimal  places  and  for  many  cases  they  agree  up 
to  three  decimal  places. 

Based  on  the  comparisons  made  so  far  it  can  be  concluded  that 
approximations  based  on  the  lambda  distribution  with  proper  values 
of  scale  and  shape  parameters  work  very  well  and  we  may  not  need 
tables  for  selection  procedures  for  different  distributions. 

More generally, for any (parametric) statistical inference problem, one may use the lambda distribution model to get good approximate results. This advantage may be useful for some package programs on selection and ranking problems mentioned in the introduction.
Table 1.1

Values of $\beta$ and $\gamma$ of Tukey's symmetric lambda distribution for given kurtosis and unit variance

kurtosis      $\beta$          $\gamma$
1.8          .5773503        1.0000000
1.9          .5360259         .7315156
2.0          .4951808         .5843119
2.1          .4563041         .4839393
2.2          .4197244         .4092117
2.3          .3854375         .3506705
2.4          .3533229         .3032138
2.5          .3232217         .2637705
2.6          .2949687         .2303522
2.7          .2684053         .2016015
2.8          .2433846         .1765539
2.9          .2197734         .1545019
3.0          .1974514         .1349125
3.1          .1763108         .1173758
3.2          .1562549         .1015705
3.3          .1371972         .0872407
3.4          .1190600         .0741800
3.5          .1017736         .0622194
3.6          .0852749         .0512197
3.7          .0695075         .0410645
3.8          .0544199         .0316561
3.9          .0399657         .0229114
4.0          .0261027         .0147597
4.1          .0127925         .0071401
4.2         -.0006589        -.0003630
4.3         -.0123069        -.0067065
4.4         -.0241574        -.0130192
4.5         -.0355787        -.0189735
4.6         -.0465955        -.0246001
4.7         -.0572307        -.0299266
4.8         -.0675053        -.0349774
4.9         -.0774389        -.0397743
5.0         -.0870496        -.0443366
5.1         -.0963542        -.0486820
5.2         -.1053681        -.0528262
5.3         -.1141060        -.0567834
5.4         -.1225813        -.0605666
5.5         -.1308066        -.0641874
5.6         -.1387938        -.0676566
5.7         -.1465539        -.0709839
5.8         -.1540971        -.0741781
5.9         -.1614332        -.0772475
6.0         -.1685712        -.0801994
6.1         -.1755197        -.0830410
6.2         -.1822868        -.0857783
6.3         -.1888799        -.0884174
6.4         -.1953064        -.0909637
6.5         -.2015728        -.0934222
6.6         -.2076855        -.0957974
6.7         -.2136507        -.0980939
6.8         -.2194739        -.1003156
6.9         -.2251605        -.1024662
7.0         -.2307158        -.1045492
7.1         -.2361444        -.1065680
7.2         -.2414511        -.1085255
7.3         -.2466402        -.1104247
7.4         -.2517159        -.1122682
7.5         -.2566820        -.1140586
7.6         -.2615425        -.1157981
7.7         -.2663008        -.1174891
7.8         -.2709605        -.1191336
7.9         -.2755247        -.1207336
8.0         -.2799966        -.1222909
8.1         -.2843791        -.1238074
8.2         -.2886751        -.1252846
8.3         -.2928874        -.1267242
8.4         -.2970185        -.1281275
8.5         -.3010709        -.1294961
8.6         -.3050470        -.1308313
8.7         -.3089491        -.1321343
8.8         -.3127794        -.1334063
8.9         -.3165400        -.1346484
9.0         -.3202329        -.1358618

Table 1.2

Values of $d_0$ for the Procedure $R_T$ with $\mu_2 = 1$.

$\mu_4/\mu_2^2 = 4.6$, $(\beta, \gamma) = (-0.0466, -0.0246)$

 m   P*     k=2     k=3     k=5     k=7     k=9     k=10    k=11
 1  0.90   1.0970  1.3599  1.6026  1.7380  1.8317  1.8696  1.9033
    0.95   1.4282  1.6788  1.9139  2.0462  2.1382  2.1755  2.2088
 2  0.90   0.8606  1.0640  1.2492  1.3511  1.4210  1.4491  1.4740
    0.95   1.1148  1.3064  1.4836  1.5821  1.6500  1.6774  1.7017
 3  0.90   0.7305  0.9021  1.0571  1.1417  1.1996  1.2227  1.2433
    0.95   0.9440  1.1046  1.2520  1.3334  1.3893  1.4117  1.4316
 4  0.90   0.6455  0.7966  0.9325  1.0064  1.0567  1.0768  1.0946
    0.95   0.8330  0.9739  1.1027  1.1734  1.2219  1.2413  1.2585
 5  0.90   0.5846  0.7210  0.8434  0.9098  0.9549  0.9729  0.9883
    0.95   0.7537  0.8806  0.9963  1.0597  1.1030  1.1204  1.1357

$\mu_4/\mu_2^2 = 5.0$, $(\beta, \gamma) = (-0.0870, -0.0443)$

 m   P*     k=2     k=3     k=5     k=7     k=9     k=10    k=11
 1  0.90   1.0798  1.3399  1.5813  1.7166  1.8107  1.8488  1.8827
    0.95   1.4085  1.6575  1.8924  2.0252  2.1180  2.1557  2.1893
 2  0.90   0.8451  1.0455  1.2285  1.3295  1.3990  1.4270  1.4518
    0.95   1.0960  1.2853  1.4609  1.5589  1.6266  1.6539  1.6782
 3  0.90   0.7165  0.8852  1.0380  1.1216  1.1788  1.2018  1.2221
    0.95   0.9267  1.0849  1.2305  1.3111  1.3665  1.3887  1.4085
 4  0.90   0.6328  0.7811  0.9148  0.9876  1.0373  1.0572  1.0748
    0.95   0.8171  0.9557  1.0825  1.1524  1.2003  1.2195  1.2365
 5  0.90   0.5728  0.7067  0.8270  0.8923  0.9367  0.9545  0.9702
    0.95   0.7389  0.8636  0.9774  1.0400  1.0826  1.0998  1.1150

Table 1.2 (continued)

$\mu_4/\mu_2^2 = 5.6$, $(\beta, \gamma) = (-0.1389, -0.0667)$

 m   P*     k=2     k=3     k=5     k=7     k=9     k=10    k=11
 1  0.90   1.0589  1.3156  1.5553  1.6905  1.7849  1.8233  1.8575
    0.95   1.3845  1.6315  1.8661  2.0000  2.0934  2.1315  2.1656
 2  0.90   0.8264  1.0231  1.2035  1.2828  1.3506  1.4001  1.4023
    0.95   1.0732  1.2597  1.4334  1.5064  1.5727  1.5996  1.6234
 3  0.90   0.6997  0.8649  1.0149  1.0973  1.1537  1.1764  1.1965
    0.95   0.9059  1.0611  1.2045  1.2840  1.3388  1.3609  1.3805
 4  0.90   0.6175  0.7625  0.8135  0.9500  0.9980  1.0335  1.0344
    0.95   0.7979  0.9336  1.0582  1.1093  1.1558  1.1745  1.1910
 5  0.90   0.5586  0.6894  0.8071  0.8712  0.9148  0.9323  0.9477
    0.95   0.7210  0.8430  0.9546  1.0160  1.0580  1.0749  1.0900

$\mu_4/\mu_2^2 = 7.0$, $(\beta, \gamma) = (-0.2306, -0.1045)$

 m   P*     k=2     k=3     k=5     k=7     k=9     k=10    k=11
 1  0.90   1.0231  1.2736  1.5101  1.6448  1.7395  1.7782  1.8127
    0.95   1.3427  1.5861  1.8196  1.9540  2.0489  2.0877  2.1225
 2  0.90   0.7947  0.9851  1.1608  1.2587  1.3266  1.3541  1.3785
    0.95   1.0345  1.2159  1.3862  1.4820  1.5488  1.5759  1.6000
 3  0.90   0.6714  0.8306  0.9759  1.0560  1.1111  1.1334  1.1531
    0.95   0.8706  1.0209  1.1604  1.2380  1.2917  1.3134  1.3327
 4  0.90   0.5917  0.7312  0.8576  0.9270  0.9744  0.9935  1.0104
    0.95   0.7656  0.8965  1.0172  1.0840  1.1300  1.1486  1.1650
 5  0.90   0.5349  0.6605  0.7739  0.8357  0.8780  0.8949  0.9099
    0.95   0.6911  0.8086  0.9164  0.9758  1.0166  1.0330  1.0475

CHAPTER II


ISOTONIC PROCEDURES FOR SELECTING POPULATIONS
BETTER THAN A CONTROL FOR TUKEY'S GENERALIZED
LAMBDA DISTRIBUTIONS AND LOGISTIC DISTRIBUTIONS

2.1  Introduction 

The problem of selecting a subset containing all populations better than a control or standard has been considered by many authors under different formulations. Dunnett (1955), Gupta and Sobel (1958), Gupta (1965), Rizvi, Sobel and Woodworth (1968), Bechhofer (1968), Huang (1974), Naik (1975), Turnbull (1976), Broström (1977), and Gupta and Singh (1979) have studied this problem. Using a decision-theoretic Bayesian approach, Gupta and Kim (1980), Gupta and Hsiao (1981) and Gupta and Miescke (1984) have also considered this problem. For further references, see Gupta and Panchapakesan (1979) and Dudewicz and Koo (1982).

However,  most  of  these  papers  assume  that  there  is  no  knowledge 
about  the  correct  ordering  among  unknown  parameters.  But  in 
practice,  there  are  cases  where  the  experimenter  may  know  the 
correct  ordering  even  though  the  values  of  parameters  are  unknown. 

For example, in pharmacological studies, a higher amount of acetaminophen in a pain reliever will result in a quicker effect on relieving fever. In this situation, when the experimenter considers the time taken to reduce the temperature to a certain degree as a measurement of the effect, the experimenter knows the correct ordering among several pain relievers with different amounts of acetaminophen even though the true values of the times are unknown. For this case, then, it is reasonable to assume an ordering prior. Selection procedures under the assumption of ordering priors are, in general, concerned with isotonic inference. Recently Gupta and Yang (1984) have considered isotonic selection procedures for the case of normal populations. They have also considered some isotonic procedures under the assumption of partial ordering. Gupta and Huang (1983) have studied isotonic procedures for the case of binomial populations, and Gupta and Leu (1983b) have proposed and studied isotonic selection procedures for unknown guarantee lifetimes in the case of two-parameter exponential populations. Huang (1984) has also proposed and studied a nonparametric isotonic selection procedure.

In  this  chapter  we  investigate  isotonic  selection  procedures  for 
the  family  of  lambda  distributions  and  for  the  logistic  populations. 

As pointed out earlier, the lambda family of distributions was defined by Tukey (1960) and generalized by Ramberg and Schmeiser (1972, 1974). It is well known that the lambda family of distributions can be used to approximate many univariate continuous distributions very well, as shown in Chapter 1. For further discussion relating to the lambda family of distributions, reference should be made to Section 1.2 of Chapter 1. Here we restrict ourselves to the family of symmetric lambda distributions. We also study the logistic distribution, which is frequently used as a model in biological assay problems (see, for example, Berkson (1944, 1951, 1953) and Finney (1947)).

In  Section  2.2,  we  introduce  notations  and  definitions  used 
in  this  chapter. 

In Section 2.3, some isotonic selection procedures are proposed and studied for symmetric lambda populations and for the logistic populations. In particular, we investigate the approximations of constants used in the proposed procedures, mainly because of difficulties involved in obtaining the exact distribution of sums of sample medians. For both the lambda distribution and the logistic distribution, moments of sums of sample medians are derived.

2.2  Preliminaries 

Let $\pi_0, \pi_1, \ldots, \pi_k$ be $(k+1)$ independent populations, where $\pi_0$ can be regarded as a control or standard population. Let a random variable $X_i$ be the observable characteristic of $\pi_i$ and let $X_{ij}$, $j = 1, 2, \ldots, n$, be $n$ independent observations from $\pi_i$, $i = 1, \ldots, k$, respectively. Let $F(\cdot \mid \theta_i, \underline{\xi})$ be the cumulative distribution function (cdf) of the random variable $X_i$, where $\theta_i$ is an unknown location parameter that we are interested in and $\underline{\xi}$ is a vector of nuisance parameters which are assumed to be common and known. For the lambda populations, $\underline{\xi}$ is a vector of the common known scale and shape parameters, and for the logistic populations, $\underline{\xi}$ is a common known variance. The value of $\theta_0$ associated with $\pi_0$ may or may not be known. A population $\pi_i$ is said to be "good" ("bad") if $\theta_i \ge (<)\ \theta_0$.

Assume that we have a simple ordering prior of $\theta_1, \ldots, \theta_k$. Without loss of generality, let $\theta_1 \le \theta_2 \le \cdots \le \theta_k$. Of course, the true values of the $\theta_i$'s are unknown. Our goal is to select a nontrivial subset which includes all good populations with the requirement that the minimum probability of a correct selection (CS) be at least equal to a preassigned number $P^*$.

Let $\Omega = \{\underline{\theta} = (\theta_0, \theta_1, \ldots, \theta_k) \mid -\infty < \theta_1 \le \theta_2 \le \cdots \le \theta_k < \infty,\ -\infty < \theta_0 < \infty\}$ be the parameter space, where $\Omega \subset \mathbb{R}^{k+1}$. Also let us define

$\Omega_0 = \{\underline{\theta} \in \Omega : \theta_k < \theta_0\},$

$\Omega_i = \{\underline{\theta} \in \Omega \mid \theta_{k-i} < \theta_0 \le \theta_{k-i+1}\}, \quad i = 1, 2, \ldots, k-1,$

and

$\Omega_k = \{\underline{\theta} \in \Omega \mid \theta_0 \le \theta_1\}.$

Then the $\Omega_i$'s are mutually disjoint sets and $\Omega = \bigcup_{i=0}^{k} \Omega_i$. We now give some definitions.


Definition 2.2.1. A selection procedure $R$ is called isotonic if and only if, whenever it selects $\pi_i$ with $\theta_i$, it also selects $\pi_j$ when $\theta_i \le \theta_j$.

Definition 2.2.2. A real-valued function $f$ defined on a poset $(S, \le)$, where $\le$ denotes a binary partial order on a set $S$, is called isotonic if $f$ preserves the partial order on $S$.

Definition 2.2.3. Let $g$ be a given function on $(S, \le)$ and let $W$ be a given positive function on $(S, \le)$. An isotonic function $g^*$ on $(S, \le)$ is called an isotonic regression of $g$ with weights $W$ if it minimizes the sum $\sum_{x \in S} [g(x) - g^*(x)]^2\, W(x)$ over the class of all isotonic functions on $S$.

From Barlow, Bartholomew, Bremner and Brunk (1972), it is known that there exists one and only one isotonic regression of a given $g$ with weights $W$ on $S$ when $S$ is simply ordered. Also the isotonic estimator of $\theta_i$ can be found by using the max-min formulas given by Ayer, Brunk, Ewing, Reid and Silverman (1955) as follows.

Let $\tilde{X}_i$ be the sample median of $\pi_i$ based on $n$ independent observations $X_{i1}, \ldots, X_{in}$, $i = 1, 2, \ldots, k$, respectively. For convenience, let $n = 2m+1$, $m \ge 0$, and let the common known variance be 1 for both lambda and logistic populations. Also let $C^2$ denote the common known variance of $\tilde{X}_i$. Let us define a finite set $S = \{\theta_1, \ldots, \theta_k\}$, let $g(\theta_i) = \tilde{X}_i$, and let $W(\theta_i) = w_i = n$, $i = 1, 2, \ldots, k$, respectively. Then by the max-min formulas, the isotonic regression of $g$ with weights $W$ is $g^*$, where

$g^*(\theta_i) = \max_{1 \le s \le i}\ \min_{s \le t \le k}\ \dfrac{\tilde{X}_s + \cdots + \tilde{X}_t}{t - s + 1}.$

Hence the isotonic estimator $\hat{X}_{i:k}$ of $\theta_i$ is

$\hat{X}_{i:k} = \max_{1 \le s \le i} \hat{\hat{X}}_{s:k}, \qquad \hat{\hat{X}}_{s:k} = \min\left\{\tilde{X}_s,\ \dfrac{\tilde{X}_s + \tilde{X}_{s+1}}{2},\ \ldots,\ \dfrac{\tilde{X}_s + \cdots + \tilde{X}_k}{k - s + 1}\right\},$

for $i = 1, 2, \ldots, k$, respectively.
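
The max-min formula translates directly into a few lines of code. The sketch below is only an illustration with equal weights (as in the text, where every $w_i = n$); the function name `isotonic_estimates` and the example inputs are hypothetical and not taken from the source.

```python
from typing import List, Sequence


def isotonic_estimates(medians: Sequence[float]) -> List[float]:
    """Equal-weight isotonic estimates via the max-min formula.

    For each s, inner[s] is the minimum over t >= s of the average of
    medians[s..t]; the i-th estimate is the maximum of inner[0..i].
    """
    k = len(medians)
    inner = []
    for s in range(k):
        running_sum, best = 0.0, float("inf")
        for t in range(s, k):
            running_sum += medians[t]
            best = min(best, running_sum / (t - s + 1))
        inner.append(best)

    estimates, current_max = [], float("-inf")
    for i in range(k):
        current_max = max(current_max, inner[i])
        estimates.append(current_max)
    return estimates


if __name__ == "__main__":
    # Sample medians that violate the assumed ordering get pooled.
    print(isotonic_estimates([1.0, 3.0, 2.0]))
```

The resulting estimates are always nondecreasing in $i$, which is what allows the stepwise procedures of Section 2.3 to be isotonic.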


We give the following definition for the sake of completeness.

Definition 2.2.4. Let $F(\cdot \mid \theta_i, \underline{\xi})$ be a symmetric lambda family of distributions. Then, for $\underline{\xi} = (\beta, \gamma)$ and $0 \le u \le 1$,

(2.2.1)  $F^{-1}(u) = \theta_i + \dfrac{1}{\beta}\,[u^{\gamma} - (1-u)^{\gamma}],$

where $\theta_i$ is a location parameter, $\beta$ is a scale parameter and $\gamma$ is a shape parameter.

For further discussion on the properties of the family of lambda distributions, reference should be made to Section 1.2 of Chapter 1.
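
Because the lambda family is defined through its quantile function, observations and sample medians are easy to simulate by the inverse-transform method. The following is a small sketch under that assumption; the function names and the example parameter values (the unit-variance uniform case from Table 1.1) are illustrative choices, not part of the source.

```python
import random
import statistics


def lambda_quantile(u: float, theta: float, beta: float, gamma: float) -> float:
    """Quantile function (2.2.1): theta + (u**gamma - (1 - u)**gamma) / beta."""
    return theta + (u ** gamma - (1.0 - u) ** gamma) / beta


def sample_median(m: int, theta: float, beta: float, gamma: float,
                  rng: random.Random) -> float:
    """Median of n = 2m + 1 inverse-transform draws from the lambda distribution."""
    draws = []
    while len(draws) < 2 * m + 1:
        u = rng.random()
        if u == 0.0:  # avoid 0 ** gamma when gamma is negative
            continue
        draws.append(lambda_quantile(u, theta, beta, gamma))
    return statistics.median(draws)


if __name__ == "__main__":
    rng = random.Random(12345)
    # beta = 0.5773503, gamma = 1 is the unit-variance uniform case.
    print(sample_median(m=2, theta=0.0, beta=0.5773503, gamma=1.0, rng=rng))
```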


2.3. Proposed Procedures $R_1$ and $R_2$

We confine ourselves to the class of isotonic procedures which satisfy the $P^*$-condition, i.e., for an isotonic rule $R$,

(2.3.1)  $\inf_{\underline{\theta} \in \Omega} P_{\underline{\theta}}(CS|R) \ge P^*.$


2.3.1. Definitions of the Proposed Rules $R_1$ and $R_2$

The cases of both known and unknown $\theta_0$ are considered.

(A) $\theta_0$ known

Since $\theta_0$ is known, no samples need to be taken from the control population $\pi_0$. Now the rule $R_1$ is proposed as follows:

Procedure $R_1$: Steps $i = 1, 2, \ldots, k-1$ are defined as follows:

Step $i$. Select the subset $\{\pi_i, \ldots, \pi_k\}$ and stop if

$\hat{X}_{i:k} \ge \theta_0 - C d^{(1)}_{i:k},$

otherwise reject $\pi_i$ and go to Step $i+1$, and

Step $k$. Select $\pi_k$ if

$\hat{X}_{k:k} \ge \theta_0 - C d^{(1)}_{k:k},$

otherwise reject $\pi_k$ and decide that none of the $k$ populations are good.
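
The stepwise structure of Procedure $R_1$ can be sketched in a few lines. This is an illustration only: it assumes the isotonic estimates $\hat{X}_{i:k}$ and the constants $d^{(1)}_{i:k}$ have already been computed, and the function name, argument names and numerical inputs are hypothetical.

```python
from typing import List, Sequence


def procedure_r1(iso_estimates: Sequence[float], theta0: float,
                 c: float, d: Sequence[float]) -> List[int]:
    """Return the 1-based indices of the selected populations.

    Step i selects {pi_i, ..., pi_k} and stops as soon as
    iso_estimates[i] >= theta0 - c * d[i]; otherwise pi_i is rejected.
    An empty list means none of the populations is declared good.
    """
    k = len(iso_estimates)
    for i in range(k):
        if iso_estimates[i] >= theta0 - c * d[i]:
            return list(range(i + 1, k + 1))
    return []


if __name__ == "__main__":
    # Hypothetical inputs: three populations, control value 1.0, C = 1.
    print(procedure_r1([0.0, 0.5, 2.0], theta0=1.0, c=1.0, d=[0.6, 0.6, 0.6]))
```

Because the isotonic estimates are nondecreasing, the selected set always has the form $\{\pi_i, \ldots, \pi_k\}$, so the rule is isotonic by construction.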

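Here $d^{(1)}_{i:k}$, $i = 1, 2, \ldots, k$, are chosen to be the smallest non-negative constants so that the procedure $R_1$ is isotonic and meets the $P^*$-condition. Since

(2.3.2)  $\inf_{\underline{\theta} \in \Omega} P_{\underline{\theta}}(CS|R_1) = \inf_{1 \le i \le k}\ \inf_{\underline{\theta} \in \Omega_i} P_{\underline{\theta}}(CS|R_1),$

the $P^*$-condition is equivalent to

(2.3.3)  $\inf_{\underline{\theta} \in \Omega_i} P_{\underline{\theta}}(CS|R_1) \ge P^*, \quad \text{for } i = 1, \ldots, k.$

Also, for any $\underline{\theta} \in \Omega_i$, $1 \le i \le k$, let

$\hat{Z}_{i:k} = \min\left\{Z_i,\ \dfrac{Z_i + Z_{i+1}}{2},\ \ldots,\ \dfrac{Z_i + \cdots + Z_k}{k - i + 1}\right\},$

where

$Z_i = \dfrac{\tilde{X}_i - \theta_i}{C}, \quad i = 1, \ldots, k.$

Then

(2.3.4)  $P_{\underline{\theta}}(CS|R_1) = P_{\underline{\theta}}\left\{\bigcup_{j=1}^{k-i+1} \big(\hat{X}_{j:k} \ge \theta_0 - C d^{(1)}_{j:k}\big)\right\},$

which is non-decreasing in $\theta_t$, $t = 1, \ldots, k-i+1$. Thus

(2.3.5)  $\inf_{\underline{\theta} \in \Omega_i} P_{\underline{\theta}}(CS|R_1) \ge \Pr\big(\hat{Z}_{k-i+1:k} \ge -d^{(1)}_{k-i+1:k}\big).$

Also one can see that

(2.3.6)  $\inf_{\underline{\theta} \in \Omega_i} P_{\underline{\theta}}(CS|R_1) \le P_{\underline{\theta}^*}\left[\bigcup_{j=1}^{k-i+1} \big(\hat{X}_{j:k} \ge \theta_0 - C d^{(1)}_{j:k}\big)\right] = \Pr\big\{\hat{Z}_{k-i+1:k} \ge -d^{(1)}_{k-i+1:k}\big\},$

where $\underline{\theta}^* = (\theta_0, -\infty, \ldots, -\infty, \underbrace{\theta_0, \ldots, \theta_0}_{i \text{ terms}})$.

Since $\hat{Z}_{k-i+1:k}$ has the same distribution as $\hat{Z}_{1:i}$, the following theorem holds.

Theorem 2.3.1. For given $P^*$ $(0 < P^* < 1)$ and $\underline{\theta} \in \Omega_i$,

(2.3.7)  $\inf_{\underline{\theta} \in \Omega_i} P_{\underline{\theta}}(CS|R_1) = \Pr\big\{\hat{Z}_{1:i} \ge -d^{(1)}_{k-i+1:k}\big\}, \quad i = 1, \ldots, k.$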


From Theorem 2.3.1, one can get the following corollary.

Corollary 2.3.1. For a given $P^*$ $(0 < P^* < 1)$, $d^{(1)}_{k-i+1:k}$, which is the solution of the equation

$\Pr\big(\hat{Z}_{1:i} \ge -z\big) = P^*,$

satisfies the $P^*$-condition for the procedure $R_1$.

Proof. The proof is straightforward and hence omitted.

The evaluation of the constants $d^{(1)}_{k-i+1:k}$ will be discussed in the next section.

Remarks:

(1) Since $\hat{Z}_{k-i+1:k}$ has the same distribution as $\hat{Z}_{1:i}$, $d^{(1)}_{k-i+1:k} = d^{(1)}_{1:i}$, $i = 1, 2, \ldots, k$.

(2) It can be seen that $d^{(1)}_{1:i}$ is increasing in $i$.


(B) $\theta_0$ unknown

Since $\theta_0$ is unknown, $n$ independent observations $X_{01}, \ldots, X_{0n}$ from the control population $\pi_0$ are taken. Let $\tilde{X}_0$ denote the median of the above sample. Then the selection procedure $R_2$ is defined as follows:

Procedure $R_2$: Steps $i = 1, \ldots, k-1$ are defined as follows:

Step $i$. Select the subset $\{\pi_i, \ldots, \pi_k\}$ and stop if

$\hat{X}_{i:k} \ge \tilde{X}_0 - C d^{(2)}_{i:k},$

otherwise reject $\pi_i$ and go to Step $i+1$, and

Step $k$. Select $\pi_k$ only and stop if

$\hat{X}_{k:k} \ge \tilde{X}_0 - C d^{(2)}_{k:k},$

otherwise reject $\pi_k$ and decide that none of them are good populations.

Now, similar to Theorem 2.3.1, the following theorem holds.

Theorem 2.3.2. For given $P^*$ $(0 < P^* < 1)$ and $\underline{\theta} \in \Omega_i$,

(2.3.8)  $\inf_{\underline{\theta} \in \Omega_i} P_{\underline{\theta}}(CS|R_2) = \Pr\big\{\hat{Z}_{1:i} \ge Z_0 - d^{(2)}_{k-i+1:k}\big\}, \quad i = 1, \ldots, k,$

where $Z_0 = (\tilde{X}_0 - \theta_0)/C$.

Proof. The proof is analogous to that of Theorem 2.3.1 and hence omitted.

Corollary 2.3.2. For given $P^*$ $(0 < P^* < 1)$, $d^{(2)}_{k-i+1:k}$, which is the solution of the equation

(2.3.9)  $\Pr\big\{\hat{Z}_{1:i} \ge Z_0 - z\big\} = P^*,$

satisfies the $P^*$-condition for the rule $R_2$.

Proof. The proof is straightforward and hence omitted.

The evaluation of the constants $d^{(2)}_{k-i+1:k}$ will be discussed in the following section.

Remark: It can be seen that for $i = 1, \ldots, k$, $d^{(2)}_{k-i+1:k} = d^{(2)}_{1:i}$, and also $d^{(2)}_{1:i}$ is increasing in $i$.


2.3.2.  The  Evaluation  of  Constants  dj^j^  ,|;  and  dj.2  ■  t  ^ 

(21 

Since  the  evaluation  of  constants  d£_(+1.k  is  similar  to  that 
of  constants  dj^j^.^,  we  will  discuss  here  only  the  evaluation  of 
constants  dj^|+1.k. 


\  \  •-  %  *. 


Now  to  solve  the  equation 


(2.3.10)  Pr{Z1;i  >  -z]  =  P*. 

the  following  lemmas  are  needed.  First  the  lenma  due  to  Gupta  and 
Yang  (  1984)  based  on  the  theory  of  random  walk  will  be  cited  with 
out  proof. 

Lemma 2.3.1. Suppose $U_1, U_2, \ldots$ are iid random variables whose
distribution is not concentrated on a half-axis. Let $S_0 = 0$,
$S_j = U_1 + \cdots + U_j$, j = 1,2,..., and let $U_i = T_i + x$,
where $E(T_i) = 0$, for i = 1,2,.... Let
$V_j = \min_{1 \le r \le j} \frac{1}{r} S_r$. Then

(2.3.11) $\Pr(V_{\ell+1} \ge x) = \frac{1}{\ell+1} \sum_{j=0}^{\ell} \Pr(V_j \ge x)\,\Pr(S_{\ell-j+1} \ge 0)$,

where $\Pr(V_0 \ge x) = 1$ for all x.
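
As a rough illustration of how a recursion of the form (2.3.11) can be evaluated numerically, the sketch below assumes that the probabilities $q_n = \Pr(S_n \ge 0)$ have already been approximated by some other means and are supplied as a list. The function name and the list layout are hypothetical, chosen only for this example.

```python
def recursion_probabilities(q):
    """Evaluate p[l+1] = (1/(l+1)) * sum_{j=0..l} p[j] * q_{l-j+1}, with p[0] = 1.

    q[n-1] is an (assumed, externally supplied) approximation of Pr(S_n >= 0).
    Returns the list p[0..len(q)], where p[n] plays the role of Pr(V_n >= x).
    """
    p = [1.0]
    for ell in range(len(q)):
        # q_{ell-j+1} is stored at index ell-j because q is 0-based
        total = sum(p[j] * q[ell - j] for j in range(ell + 1))
        p.append(total / (ell + 1))
    return p


if __name__ == "__main__":
    # Symmetric continuous steps with no shift: every q_n equals 1/2.
    print(recursion_probabilities([0.5, 0.5, 0.5, 0.5]))
```

For a symmetric continuous step distribution with no shift, every $q_n$ equals 1/2 and the recursion returns the classical values 1/2, 3/8, 5/16, ..., which gives a simple check of an implementation.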

To use Lemma 2.3.1, it is first necessary to evaluate the
quantity $\Pr(S_{\ell-j+1} \ge 0)$, where for ease of notation $S_j$ denotes the
sum of j iid sample medians for both symmetric lambda and logistic
populations. Finding the exact closed form of the distribution of
$S_j$ is very difficult. Hence one can consider several ways to
approximate the quantity $\Pr(S_{\ell-j+1} \ge 0)$, for example, (i) the
Cornish-Fisher expansion, (ii) the Monte Carlo method, and (iii) approximation
by a lambda distribution. Since the lambda family of distributions
can be used to approximate many theoretical distributions
very well, provided that the values of the scale and shape parameters
are properly chosen (based on the standardized second and fourth
moments), the method of approximation by a lambda distribution will
be applied. Hence it is necessary to compute the second and fourth
central moments of the sum of k sample medians from k iid symmetric
lambda distributions with mean 0 and variance 1. The same problem
for the case of logistic distributions will be discussed later.


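Lemma 2.3.2. Let $\mu_r$ be the rth central moment of the sum of k
sample medians from k iid distributions based on a common sample
size n = 2m+1, m ≥ 0. Then for k symmetric lambda distributions
with common scale and shape parameters $\beta$ and $\gamma$, respectively,

(2.3.12) $\mu_2 = \dfrac{2k\,\Gamma(2m+2)\,\{\Gamma(m+1)\Gamma(m+1+2\gamma) - [\Gamma(m+1+\gamma)]^2\}}{\beta^2\,[\Gamma(m+1)]^2\,\Gamma(2m+2+2\gamma)}$,

(2.3.13) $\mu_4 = \dfrac{12k(k-1)}{\beta^4}\left\{\dfrac{\Gamma(2m+2)\,(\Gamma(m+1)\Gamma(m+1+2\gamma) - [\Gamma(m+1+\gamma)]^2)}{[\Gamma(m+1)]^2\,\Gamma(2m+2+2\gamma)}\right\}^2$
$\qquad + \dfrac{2k\,\Gamma(2m+2)}{\beta^4\,[\Gamma(m+1)]^2\,\Gamma(2m+2+4\gamma)}\,\{\Gamma(m+1)\Gamma(m+1+4\gamma) - 4\Gamma(m+1+\gamma)\Gamma(m+1+3\gamma) + 3[\Gamma(m+1+2\gamma)]^2\}$,

where $\Gamma(\cdot)$ is the gamma function.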


Proof. Let $\varphi_k(t)$ be the moment generating function of the sum of k
iid sample medians. Then it is well known that $\varphi_k(t) = [\varphi_1(t)]^k$.
Also one can get that

(2.3.14) $\varphi_1(t) = \dfrac{\Gamma(2m+2)}{[\Gamma(m+1)]^2} \sum_{j=0}^{\infty}\sum_{\ell=0}^{\infty} \dfrac{(-1)^{\ell}\,t^{j+\ell}}{\ell!\,j!\,\beta^{j+\ell}}\,Be(m+1+j\gamma,\ m+1+\ell\gamma)$,

where Be(a,b) is the complete beta function with parameters a and b.
Thus by the standard method, one gets the result. Hence the proof
is complete.
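
The moment formulas of Lemma 2.3.2 can be checked numerically. The sketch below is only an illustration: it assumes the symmetric lambda quantile function has the form $Q(u) = (u^\gamma - (1-u)^\gamma)/\beta$ (location 0), writes the gamma-function ratios as beta functions, and uses hypothetical function names. It requires all beta-function arguments to be positive (for example $1 + 4\gamma > 0$ when m = 0).

```python
from math import exp, lgamma, sqrt


def beta_fn(a, b):
    """Complete beta function Be(a, b) for positive a and b."""
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))


def median_sum_moments(k, m, beta, gam):
    """Second and fourth central moments of the sum of k iid sample medians,
    each median taken from n = 2m+1 symmetric lambda observations."""
    base = beta_fn(m + 1, m + 1)
    # second and fourth moments of a single sample median
    e2 = 2.0 * (beta_fn(m + 1, m + 1 + 2 * gam)
                - beta_fn(m + 1 + gam, m + 1 + gam)) / (beta ** 2 * base)
    e4 = 2.0 * (beta_fn(m + 1, m + 1 + 4 * gam)
                - 4.0 * beta_fn(m + 1 + gam, m + 1 + 3 * gam)
                + 3.0 * beta_fn(m + 1 + 2 * gam, m + 1 + 2 * gam)) / (beta ** 4 * base)
    mu2 = k * e2
    mu4 = k * e4 + 3.0 * k * (k - 1) * e2 ** 2
    return mu2, mu4


def kurtosis_of_sum(k, m, beta, gam):
    mu2, mu4 = median_sum_moments(k, m, beta, gam)
    return mu4 / mu2 ** 2


if __name__ == "__main__":
    # Uniform case with variance 1: beta = 1/sqrt(3), gamma = 1.
    print(median_sum_moments(1, 0, 1.0 / sqrt(3.0), 1.0))
    print(kurtosis_of_sum(2, 0, 1.0 / sqrt(3.0), 1.0))
```

With $(\beta, \gamma) = (1/\sqrt{3}, 1)$ and m = 0 the median is a single uniform observation with variance 1, so the sketch should return variance 1 and kurtosis 1.8 for k = 1, and kurtosis 2.4 for the sum of two.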


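Remark: In addition to Lemma 2.3.2, $\mu_6$ is computed and is given as
follows:

(2.3.15) $\mu_6 = 15k(k-1)(k-2)\left\{\dfrac{\Gamma(2m+2)}{[\Gamma(m+1)]^2}A_1\right\}^3 + 15k(k-1)\left\{\dfrac{\Gamma(2m+2)}{[\Gamma(m+1)]^2}\right\}^2 A_1 A_2 + k\,\dfrac{\Gamma(2m+2)}{[\Gamma(m+1)]^2}\,A_3$,

where

(2.3.16) $A_1 = \dfrac{2}{\beta^2}\{Be(m+1,\ m+1+2\gamma) - Be(m+1+\gamma,\ m+1+\gamma)\}$,

(2.3.17) $A_2 = \dfrac{2}{\beta^4}\{Be(m+1,\ m+1+4\gamma) - 4Be(m+1+\gamma,\ m+1+3\gamma) + 3Be(m+1+2\gamma,\ m+1+2\gamma)\}$,

and

(2.3.18) $A_3 = \dfrac{2}{\beta^6}\{Be(m+1,\ m+1+6\gamma) - 6Be(m+1+\gamma,\ m+1+5\gamma) + 15Be(m+1+2\gamma,\ m+1+4\gamma) - 10Be(m+1+3\gamma,\ m+1+3\gamma)\}$.

This result for $\mu_6$ (and higher moments) can be used if one wants
to use the Cornish-Fisher expansion.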

To find the proper values of the scale and shape parameters of
a lambda distribution from Lemma 2.3.2, values of kurtosis for the
sum of k sample medians based on n = 2m+1 observations from lambda
distributions with mean 0 and variance 1 are given in Table II.1 for
k = 1(1)5(2)11, 15, 20 and m = 0(1)5(2)9, 10(5)20, 30, 50 when the
underlying lambda distributions have common kurtosis 4.6, 6.0 and
7.0. Furthermore, based on Lemma 2.3.1 and Lemma 2.3.2, values of
$d^{(1)}_{k-i+1:k}$ for the lambda populations are computed. They are given
in Table II.2 for m = 0(1)3(2)9, 10 and P* = 0.75, 0.90, 0.95, 0.99 when
the underlying lambda populations have common variance 1 and common
kurtosis 4.6, 6.0 and 7.0.

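For the case of logistic populations, the following lemma, which
is similar to Lemma 2.3.2, holds.

Lemma 2.3.3. Let n = 2m+1, m ≥ 0, be the common sample size of k
iid logistic populations. Then the second and fourth central moments
of the sum of k sample medians from the k logistic populations are:

(2.3.19) $\mu_2 = \dfrac{2k}{a^2}\left(\dfrac{\pi^2}{6} - \sum_{j=1}^{m}\dfrac{1}{j^2}\right)$

and

(2.3.20) $\mu_4 = \dfrac{12k(k-1)}{a^4}\left(\dfrac{\pi^2}{6} - \sum_{j=1}^{m}\dfrac{1}{j^2}\right)^2 + \dfrac{12k}{a^4}\left[\left(\dfrac{\pi^2}{6} - \sum_{j=1}^{m}\dfrac{1}{j^2}\right)^2 + \left(\dfrac{\pi^4}{90} - \sum_{j=1}^{m}\dfrac{1}{j^4}\right)\right]$,

where $a = \pi/\sqrt{3}$.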


Proof. Noting the fact that

(2.3.21) $\varphi_k(t) = \prod_{j=m+1}^{\infty}\left[1 - \left(\dfrac{t}{aj}\right)^2\right]^{-k}$,

the proof is analogous to that of Lemma 2.3.2 and hence omitted.
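
The logistic moments can be evaluated in the same way. The sketch below is a minimal illustration, assuming a logistic distribution scaled to variance 1 (scale $1/a$ with $a = \pi/\sqrt{3}$) and using the cumulants implied by the product form of the moment generating function; the function name is hypothetical.

```python
from math import pi, sqrt


def logistic_median_sum_moments(k, m):
    """Second and fourth central moments of the sum of k iid sample medians,
    each from n = 2m+1 logistic observations with variance 1."""
    a = pi / sqrt(3.0)
    s2 = pi ** 2 / 6.0 - sum(1.0 / j ** 2 for j in range(1, m + 1))
    s4 = pi ** 4 / 90.0 - sum(1.0 / j ** 4 for j in range(1, m + 1))
    mu2 = 2.0 * k * s2 / a ** 2
    mu4 = (12.0 * k * (k - 1) * s2 ** 2 + 12.0 * k * (s2 ** 2 + s4)) / a ** 4
    return mu2, mu4


if __name__ == "__main__":
    for k in (1, 2, 5):
        mu2, mu4 = logistic_median_sum_moments(k, 0)
        print(k, mu2, mu4 / mu2 ** 2)
```

For m = 0 and k = 1 this reduces to a single logistic observation, whose variance is 1 and whose kurtosis is 4.2, which provides a quick sanity check.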


Similar to the case of lambda populations, values of kurtosis
for the sum of k sample medians based on n = 2m+1 observations from
logistic distributions with common variance 1 are computed. These
are given in Table II.3 for k = 1(1)5(2)11, 15, 20 and
m = 0(1)5(2)9, 10(5)20, 30, 50. Also, based on Lemma 2.3.1 and
Lemma 2.3.3, values of $d^{(1)}_{k-i+1:k}$ for the logistic populations are computed.
These are tabulated in Table II.4 for m = 0(1)3(2)9, 10, P* = 0.75,
0.90, 0.95, 0.99 and k = 1(1)7.


2.3.3. Expected Number of Bad Populations in the Selected Subset.

Suppose $\theta_0$ is known and thus, without loss of generality, let
$\theta_0 = 0$. Let B be the number of bad populations in the subset
selected by the procedure $R_1$. Then the expected number of bad
populations due to the selection procedure $R_1$, denoted by $E_\theta(B|R_1)$,
can be used as a measure of the efficiency of the rule $R_1$. Now
for any j, 0 ≤ j ≤ k,

(2.3.22) $\sup_{\theta \in \Omega_{k-j}} E_\theta(B|R_1) = \sup_{\theta \in \Omega_{k-j}} \sum_{r=1}^{j} P_\theta\left\{\bigcup_{i=1}^{r} (\hat X_{i:k} \ge -C\,d^{(1)}_{i:k})\right\} = \sum_{r=1}^{j} \Pr\left\{\bigcup_{i=1}^{r} (Z_{i:j} \ge -d^{(1)}_{i:k})\right\}$.


Also, under the same assumption as that of the rule $R_1$, let us
consider an alternative rule $R_3$ which uses a fixed constant $d_3$ and
selects a subset simultaneously. This rule is

$R_3$: Select $\pi_i$ if and only if $\hat X_{i:k} \ge -C\,d_3$ for i = 1,2,...,k,

where $d_3\ (\ge 0)$ is chosen so as to satisfy the P*-condition. Then
one can see that $d_3 = d^{(1)}_{1:k}$ and also

(2.3.23) $\sup_{\theta \in \Omega_{k-j}} E_\theta(B|R_3) = \sum_{r=1}^{j} \Pr\left\{\bigcup_{i=1}^{r} (Z_{i:j} \ge -d_3)\right\}$.


Now the following theorem holds.

Theorem 2.3.3. For any j, 0 ≤ j ≤ k,

(2.3.24) $\sup_{\theta \in \Omega_{k-j}} E_\theta(B|R_1) \le \sup_{\theta \in \Omega_{k-j}} E_\theta(B|R_3)$.

Proof. The proof is straightforward and is based on the fact that

$d^{(1)}_{j:k} = d^{(1)}_{1:k-j+1} \le d^{(1)}_{1:k} = d_3$.

From the above theorem, $R_1$ is uniformly better than $R_3$ in terms
of the number of bad populations in the selected subset.


2.3.4. Another Procedure $R_M$.

Since the lambda family of distributions is not infinitely
divisible, it is very hard to find the exact closed form of the
distribution of the mean of samples from the lambda distribution.
This is also true for the logistic distribution. But as we have
discussed in Chapter 1, the lambda distribution can approximate
a univariate continuous theoretical distribution precisely
enough, and thus we can use a lambda distribution to approximate the
distribution of the sample mean by computing its second and fourth
moments. Thus, when this kind of approximation is acceptable, we
can consider another isotonic procedure $R_M$ based on sample means
instead of sample medians. Here we consider the case of lambda
populations with $\theta_0$ known. Now we define the isotonic procedure
$R_M$ as follows:

Procedure $R_M$: Steps i = 1,...,k-1 are defined as follows:

Step i. Select the subset $\{\pi_i,\ldots,\pi_k\}$ and stop if

$\hat X^M_{i:k} \ge \theta_0 - C_M\,d^M_{i:k}$,

otherwise reject $\pi_i$ and go to Step i+1,
and

Step k. Select $\pi_k$ and stop if

$\hat X^M_{k:k} \ge \theta_0 - C_M\,d^M_{k:k}$,

otherwise reject $\pi_k$ and decide that none of the populations are good,

where

$\hat X^M_{i:k} = \max_{1 \le s \le i} \bar X^M_{s:k}$,

$\bar X^M_{s:k} = \min\left\{\bar X_s,\ \dfrac{\bar X_s + \bar X_{s+1}}{2},\ \ldots,\ \dfrac{\bar X_s + \cdots + \bar X_k}{k-s+1}\right\}$,

and

$\mathrm{Var}(\bar X_i) = C_M^2$.

Here $d^M_{i:k}$ are the smallest nonnegative constants such that the
procedure $R_M$ is isotonic and meets the P*-condition.
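
The stepwise rule can be sketched in a few lines of code. This is an illustrative sketch only: the function names are hypothetical, the constants in `d` are placeholder inputs rather than tabulated values, and the isotonic statistic is taken to be the max-min of running means of the sample means, as in the definition of the procedure.

```python
def isotonic_estimates(xbar):
    """Return [Xhat_{1:k}, ..., Xhat_{k:k}] for sample means xbar[0..k-1]."""
    k = len(xbar)
    inner = []
    for s in range(k):
        total, best = 0.0, float("inf")
        for t in range(s, k):
            total += xbar[t]
            best = min(best, total / (t - s + 1))  # running mean from s to t
        inner.append(best)
    estimates, running = [], float("-inf")
    for value in inner:
        running = max(running, value)  # max over starting points s <= i
        estimates.append(running)
    return estimates


def select_isotonic(xbar, theta0, c_m, d):
    """Stepwise selection: returns 1-based labels of the selected populations
    (an empty list means that none is declared good)."""
    estimates = isotonic_estimates(xbar)
    for i in range(len(xbar)):
        if estimates[i] >= theta0 - c_m * d[i]:
            return list(range(i + 1, len(xbar) + 1))
    return []


if __name__ == "__main__":
    print(isotonic_estimates([3.0, 1.0, 2.0]))
    print(select_isotonic([-2.0, 0.5, 1.0], 0.0, 1.0, [0.3, 0.3, 0.3]))
```

The double loop costs O(k^2) operations, which is negligible for the small k considered here; a pool-adjacent-violators routine would give the same estimates more efficiently for large k.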

Now, similar to that for the procedure $R_1$, the following theorem
holds.

Theorem 2.3.4. For given P* (0 < P* < 1), $d^M_{k-i+1:k}$, which is the
solution of the equation


(2.3.26) 


-2-  =  P* 


satisfies the P*-condition for the procedure $R_M$, where


Proof.  The  proof  is  similar  to  that  of  Corollary  2.3.1  and  hence 
omitted. 


To  solve  the  equation  (2.3.26),  we  can  use  the  same  method  as 
that  in  Section  2.3.2  and  thus  it  is  necessary  to  compute  second  and 
fourth  moments  of  the  sum  of  k  sample  means  based  on  n  independent 
observations  from  each  of  the  k  populations.  Then  the  following 
theorem  holds. 


Theorem 2.3.5. Let $\nu_i$ be the ith central moment of the sum of k
sample means based on n independent observations from each of the k
lambda distributions with a common scale parameter $\beta$ and a common
shape parameter $\gamma$. Assume that the common variance of the k lambda
distributions is 1. Then

$\nu_2 = \dfrac{k\,\mathrm{sum}(2)}{n\beta^2}$,

$\nu_4 = \dfrac{k}{n^3\beta^4}\,\{\mathrm{sum}(4) + 3(kn-1)\,\mathrm{sum}^2(2)\}$,

where

$\mathrm{sum}(i) = \sum_{j=0}^{i} \binom{i}{j}(-1)^j\,Be(\gamma(i-j)+1,\ \gamma j+1)$,

and Be(a,b) is the complete beta function with parameters a and b.
Proof. The proof is straightforward.
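
A short numerical sketch of these moment formulas follows. It assumes the symmetric lambda quantile function $Q(u) = (u^\gamma - (1-u)^\gamma)/\beta$ with location 0; the function names are hypothetical and the beta-function arguments must be positive.

```python
from math import comb, exp, lgamma, sqrt


def beta_fn(a, b):
    """Complete beta function Be(a, b) for positive a and b."""
    return exp(lgamma(a) + lgamma(b) - lgamma(a + b))


def lam_sum(i, gam):
    """sum(i) = sum_j C(i, j) (-1)^j Be(gam*(i-j)+1, gam*j+1)."""
    return sum(comb(i, j) * (-1) ** j * beta_fn(gam * (i - j) + 1, gam * j + 1)
               for j in range(i + 1))


def mean_sum_moments(k, n, beta, gam):
    """Second and fourth central moments of the sum of k sample means,
    each mean based on n symmetric lambda observations."""
    s2, s4 = lam_sum(2, gam), lam_sum(4, gam)
    nu2 = k * s2 / (n * beta ** 2)
    nu4 = k * (s4 + 3.0 * (k * n - 1) * s2 ** 2) / (n ** 3 * beta ** 4)
    return nu2, nu4


if __name__ == "__main__":
    # Uniform case with variance 1: beta = 1/sqrt(3), gamma = 1.
    print(mean_sum_moments(1, 1, 1.0 / sqrt(3.0), 1.0))
    print(mean_sum_moments(1, 2, 1.0 / sqrt(3.0), 1.0))
```

In the uniform case with unit variance, a single observation has second and fourth moments 1 and 1.8, and the mean of two observations has variance 0.5 and kurtosis 2.4, which the sketch should reproduce.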
Table II.2

Values of $d^{(1)}_{k-i+1:k}$ for the case of symmetric lambda
populations with common kurtosis and common variance 1

Kurtosis = 4.6

 m  k   P* = 0.75    0.90      0.95      0.99

 0  1     0.5920    1.1949    1.6141    2.5688
    2     0.7382    1.2879    1.6796    2.5929
    3     0.7836    1.3087    1.6899    2.5938
    4     0.8029    1.3148    1.6918    2.5939
    5     0.8123    1.3167    1.6922    2.5939
    6     0.8174    1.3174    1.6922    2.5939

 1  1     0.3860    0.7614    1.0081    1.5278
    2     0.4745    0.8109    1.0393    1.5358
    3     0.5005    0.8209    1.0433    1.5360
    4     0.5111    0.8236    1.0439    1.5360
    5     0.5161    0.8242    1.0440    1.5360
    6     0.5187    0.8244    1.0440    1.5360

 2  1     0.3077    0.6008    0.7885    1.1670
    2     0.3758    0.6368    0.8100    1.1747
    3     0.3954    0.6437    0.8126    1.1748
    4     0.4032    0.6454    0.8130    1.1748
    5     0.4069    0.6459    0.8130    1.1748
    6     0.4088    0.6460    0.8130    1.1748

 3  1     0.2634    0.5115    0.6682    0.9804
    2     0.3259    0.5408    0.6853    0.9839
    3     0.3369    0.5463    0.6873    0.9839
    4     0.3433    0.5477    0.6875    0.9839
    5     0.3463    0.5480    0.6875    0.9839
    6     0.3478    0.5481    0.6875    0.9839

 5  1     0.2127    0.4107    0.5339    0.7743
    2     0.2579    0.4331    0.5466    0.7767
    3     0.2706    0.4372    0.5480    0.7767
    4     0.2756    0.4382    0.5482    0.7767
    5     0.2780    0.4384    0.5482    0.7767
    6     0.2791    0.4385    0.5482    0.7767

Table II.2 (continued)

Kurtosis = 6.0

P* = 0.75    0.90    (remaining columns truncated in the source)

  0.1457    0.2803
  0.1762    0.2952
  0.1848    0.2979
  0.1882    0.2985
  0.1897    0.2987
  0.1905    0.2987

Kurtosis  =  7.0 

0.3701 

0.3738 

0.3756 


6 


0.5965 

0.5970 

0.5972 


P* = 0.75    0.90      0.95      0.99

  0.1930    0.3747    0.4893    0.7172
  0.2349    0.3961    0.5017    0.7197
  0.2468    0.4001    0.5031    0.7197
  0.2515    0.4010    0.5033    0.7197
  0.2537    0.4013    0.5033    0.7197
  0.2548    0.4014    0.5033    0.7197

  0.1662    0.3213    0.4181    0.6076
  0.2017    0.3390    0.4281    0.6095
  0.2117    0.3423    0.4293    0.6095
  0.2157    0.3431    0.4294    0.6095
  0.2175    0.3433    0.4294    0.6095
  0.2184    0.3433    0.4294    0.6095

  0.1482    0.2857    0.3709    0.5363
  0.1795    0.3011    0.3796    0.5379
  0.1883    0.3039    0.3806    0.5379
  0.1918    0.3046    0.3807    0.5379
  0.1934    0.3048    0.3807    0.5379
  0.1942    0.3048    0.3807    0.5379

  0.1411    0.2718    0.3526    0.5090
  0.1786    0.2864    0.3508    0.5104
  0.1792    0.2890    0.3617    0.5104
  0.1825    0.2896    0.3618    0.5104
  0.1840    0.2898    0.3618    0.5104
  0.1847    0.2898    0.3618    0.5104

Table II.3


Table II.4

Values of $d^{(1)}_{k-i+1:k}$ for the logistic populations with common variance 1

 m  k   P* = 0.75    0.90      0.95      0.99

 0  1     0.6047    1.2120    1.6240    2.5349
    2     0.7516    1.2983    1.6836    2.5523
    3     0.7957    1.3186    1.6900    2.5523
    4     0.8135    1.3227    1.6930    2.5523
    5     0.8223    1.3227    1.6930    2.5523
    6     0.8276    1.3227    1.6930    2.5523
    7     0.8298    1.3227    1.6930    2.5523

 1  1     0.3961    0.7776    1.0253    1.5385
    2     0.4854    0.8263    1.0552    1.5456
    3     0.5114    0.8358    1.0590    1.5457
    4     0.5219    0.8382    1.0595    1.5457
    5     0.5269    0.8389    1.0596    1.5457
    6     0.5294    0.8391    1.0596    1.5457
    7     0.5308    0.8392    1.0596    1.5457

 2  1     0.3158    0.6147    0.8046    1.1862
    2     0.3849    0.6506    0.8258    1.1901
    3     0.4047    0.6573    0.8282    1.1901
    4     0.4126    0.6591    0.8286    1.1901
    5     0.4162    0.6595    0.8286    1.1901
    6     0.4181    0.6596    0.8286    1.1901
    7     0.4191    0.6597    0.8286    1.1901

 3  1     0.2704    0.5239    0.6830    0.9973
    2     0.3286    0.5533    0.6999    1.0006
    3     0.3451    0.5587    0.7018    1.0006
    4     0.3516    0.5601    0.7021    1.0006
    5     0.3546    0.5604    0.7021    1.0006
    6     0.3562    0.5605    0.7021    1.0006
    7     0.3570    0.5605    0.7021    1.0006

 5  1     0.2183    0.4209    0.5465    0.7902
    2     0.2645    0.4436    0.5593    0.7925
    3     0.2774    0.4478    0.5607    0.7925
    4     0.2825    0.4488    0.5608    0.7925
    5     0.2850    0.4490   (missing)  0.7925
    6     0.2861    0.4491    0.5609    0.7925
    7     0.2867    0.4491    0.5609    0.7925

Table II.4 (continued)

 m  k   P* = 0.75    0.90      0.95      0.99

 7  1     0.1881    0.3616    0.4685    0.6740
    2     0.2274    0.3807    0.4791    0.6759
    3     0.2384    0.3842    0.4803    0.6759
    4     0.2428    0.3850    0.4804    0.6759
    5     0.2447    0.3852    0.4804    0.6759
    6     0.2457    0.3852    0.4804    0.6759
    7     0.2463    0.3852    0.4804    0.6759

 9  1     0.1617    0.3219    0.4165    0.5974
    2     0.2026    0.3387    0.4258    0.5990
    3     0.2123    0.3417    0.4258    0.5990
    4     0.2160    0.3424    0.4268    0.5990
    5     0.2178    0.3426    0.4268    0.5990
    6     0.2187    0.3426    0.4268    0.5990
    7     0.2192    0.3426    0.4268    0.5990

10  1     0.1596    0.3064    0.3963    0.5678
    2     0.1928    0.3223    0.4050    0.5692
    3     0.2021    0.3251    0.4059    0.5693
    4     0.2057    0.3258    0.4061    0.5693
    5     0.2073    0.3260    0.4061    0.5693
    6     0.2082    0.3260    0.4061    0.5693
    7     0.2086    0.3260    0.4061    0.5693

CHAPTER  III 


NONPARAMETRIC  SELECTION  PROCEDURES  AND 
THEIR  EFFICIENCY  COMPARISONS 


3.1.  Introduction 

Since the selection and ranking problems were introduced and
formulated, many papers have been concerned with nonparametric
selection procedures. In practice, there are many situations
in which one cannot observe the complete samples because of a lack of
resources, such as time or budget, or because of unexpected accidents, but one can
at least observe ranks. This kind of difficulty occurs very frequently in
life-testing. Also, realistically, the underlying distributions
of the populations are mostly unknown to the experimenter, and
hence a parametric approach to hypothesis testing
or other inference problems is sometimes sensitive to the assumptions on
the underlying distributions. Thus, to avoid these deficiencies of
the parametric approaches, nonparametric approaches are frequently
used. These can provide robustness against deviations from the
assumptions about the underlying distributions.

Some nonparametric selection procedures in terms of quantiles
were considered by Rizvi and Sobel (1967) and Barlow and Gupta (1969),
among others. Also, nonparametric subset selection procedures based
on ranks were studied by Nagel (1970), McDonald (1969, 1972, 1973,
1975), Gupta and McDonald (1970), Hsu (1978, 1981), Gupta, Huang
and Nagel (1979), Huang and Panchapakesan (1982), Gupta and Leu
(1983a), Gupta and Liang (1984) and Matsui (1984), among others.
Also, Bartlett and Govindarajulu (1968) have studied locally optimal
procedures based on ranks even though the functional forms of the
underlying distributions are assumed to be known.

Nagel (1970) and Gupta and McDonald (1970) proposed and studied
some nonparametric subset selection procedures for the location and
scale models which choose a subset including the best population among
k populations. The latter authors considered locally optimal selection
procedures based on some functions. But the optimal choice of the
score function for these procedures has not been studied. Since the
rank sum statistic is easy to deal with, many proposed nonparametric
subset selection procedures are based on this statistic.

In this chapter we consider the problem of choosing the optimal
score function for different procedures proposed by Nagel (1970) and
Gupta and McDonald (1970). Tukey's lambda family of distributions
is considered as the distribution for the score function
because this family of distributions can be used to approximate many
theoretical (unimodal) continuous distributions and hence it is easy
to deal with.

In  Section  3.2,  the  problem  of  selection  and  ranking  with 
nonparametric  subset  selection  procedures  is  formulated  and  notations 
and  definitions  including  proposed  procedures  are  given. 


In  Section  3.3,  we  evaluate  those  procedures  and  compute 
constants  which  are  necessary  to  carry  out  the  procedures.  Also 
the  score  function  which  leads  the  procedures  to  be  locally  optimal 
in  the  neighborhood  of  some  points  is  introduced  and  evaluated. 

A Monte Carlo study for the optimal choice of the score function
is carried out in Section 3.4. This study indicates that the score
function based on the uniform distribution is optimal and robust against
possible deviations from the underlying distributions. Also, the
score function which is a weighted sum of ranks turns out to be
optimal for some procedures. Furthermore, it shows that the Gupta-type
procedure is almost uniformly better than another available
procedure. This is not the same conclusion as that in Gupta and
McDonald (1970). The results differ because of
the small number of simulations in Gupta and McDonald
(1970) for various underlying populations, and because
they use only the rank sum statistics. Some tables
including the values of score functions are constructed. Also, some
tables containing the results of simulations are provided.

3.2  Formulation 

Let $\pi_1,\ldots,\pi_k$ be k (≥ 2) independent populations and let $X_i$ be
an observable characteristic of $\pi_i$, i = 1,2,...,k, respectively.
Assume that the random variable $X_i$ follows a continuous distribution
$F(\cdot|\theta_i)$, and that the family $\{F(\cdot|\theta)\}$ is stochastically increasing in
$\theta$. Here we assume that the $\theta_i$ are unknown location parameters. Let
$X_{ij}$, j = 1,...,n, be n independent random observations from
$\pi_i$, i = 1,2,...,k. Let $R_{ij}$ denote the rank of the observation $X_{ij}$
in the pooled sample of kn observations. Define

(3.2.1) $nH_i = \sum_{j=1}^{n} a(R_{ij})$, i = 1,2,...,k,

where a(r) is a score function defined by

$-\infty < a(r) = E(T_{(r)}|G) < \infty$,

where $T_{(1)} \le T_{(2)} \le \cdots \le T_{(N)}$ is an ordered sample of size N = nk from
a continuous distribution G. Let $\theta_{[1]} \le \theta_{[2]} \le \cdots \le \theta_{[k]}$ be the
ordered $\theta_i$'s. Since the family $\{F(x|\theta)\}$ is stochastically increasing
in $\theta$,

$F(x|\theta_{[1]}) \ge F(x|\theta_{[2]}) \ge \cdots \ge F(x|\theta_{[k]})$

for any $x \in \mathbb{R}$.

The population associated with $\theta_{[k]}$, i.e., $F(x|\theta_{[k]})$, is called
the best. In case several populations have the same largest value
$\theta_{[k]}$, one of them is randomly tagged as the best. Our goal is to
select a subset which contains the best with the usual requirement
on the probability of a correct selection (PCS), i.e., for any
procedure R,

(3.2.2) $\inf_{\theta \in \Omega} P_\theta(CS|R) \ge P^*$,

where $\Omega = \{\theta \mid \theta = (\theta_1,\ldots,\theta_k)\}$ is the parameter space.


Gupta and McDonald (1970) proposed procedures $R_1(G)$ and $R_3(G)$,
which choose a subset containing the best, and which depend on the
choice of G, as follows:

$R_1(G)$: Select $\pi_i$ if and only if $H_i \ge \max_{1 \le j \le k} H_j - d$, i = 1,2,...,k,

and

$R_3(G)$: Select $\pi_i$ if and only if $H_i \ge D$, i = 1,2,...,k,

where $d\ (\ge 0)$ and $D\ (-\infty < D < \infty)$ are chosen so as to meet the
P*-condition.
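
To make the two rules concrete, the following sketch computes the statistics $H_i$ from pooled ranks and applies both selection rules. It is an illustration under stated assumptions: observations are assumed to have no ties (as for continuous distributions), the score function is passed in as an ordinary callable, and the function names and the example data are hypothetical.

```python
def rank_score_statistics(samples, score):
    """H_i = (1/n_i) * sum_j score(R_ij), with ranks taken in the pooled sample."""
    pooled = sorted((x, i) for i, sample in enumerate(samples) for x in sample)
    totals = [0.0] * len(samples)
    for rank, (_, i) in enumerate(pooled, start=1):
        totals[i] += score(rank)
    return [total / len(sample) for total, sample in zip(totals, samples)]


def rule_r1(h, d):
    """Select population i (1-based) iff H_i >= max_j H_j - d."""
    top = max(h)
    return [i + 1 for i, value in enumerate(h) if value >= top - d]


def rule_r3(h, big_d):
    """Select population i (1-based) iff H_i >= D; the result may be empty."""
    return [i + 1 for i, value in enumerate(h) if value >= big_d]


if __name__ == "__main__":
    data = [[0.1, 0.4, 0.2], [0.3, 0.9, 0.5], [1.2, 0.8, 1.5]]
    h = rank_score_statistics(data, score=lambda r: float(r))  # rank-sum scores
    print(h, rule_r1(h, 3.0), rule_r3(h, 6.0))
```

Rule $R_1(G)$ always retains at least the population with the largest $H_i$, whereas $R_3(G)$ can return an empty subset when D is large.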

Note that rules $R_1(G)$ and $R_3(G)$ are equivalent if k = 2. Also,
the rule $R_3(G)$ may select an empty set. A usual choice of G is a
uniform distribution, which is appealing because of its simplicity.

Let $\pi_{(i)}$ be the population associated with $\theta_{[i]}$. It is easy
to see that, for rules $R_1(G)$ and $R_3(G)$,

(3.2.3) $\Pr(CS|R_1(G)) = \Pr\left(H_{(k)} \ge \max_{1 \le j \le k-1} H_{(j)} - d\right)$

and

(3.2.4) $\Pr(CS|R_3(G)) = \Pr(H_{(k)} \ge D)$,

where $H_{(i)}$ is the $H_j$ associated with $\pi_{(i)}$, i = 1,2,...,k, respectively.

3.3. Comparison of the Procedures $R_1(G)$ and $R_3(G)$.

In order to compare $R_1(G)$ and $R_3(G)$ for various choices of G,
we first need the results relating to the infimum of the PCS and the
evaluation of the necessary constants.

3.3.1. PCS for $R_1(G)$ and $R_3(G)$ and Evaluation of Associated Constants

We state below (without proof) the results regarding the
infimum of the PCS for rules $R_1(G)$ and $R_3(G)$ obtained by Gupta and
McDonald (1970).

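Theorem 3.3.1. For procedures $R_1(G)$ and $R_3(G)$,

(3.3.1) $\inf_{\theta \in \Omega} P_\theta(CS|R_j(G)) = \inf_{\theta \in \Omega_k} P_\theta(CS|R_j(G))$, j = 1,3,

and further, for the procedure $R_3(G)$,

(3.3.2) $\inf_{\theta \in \Omega} P_\theta(CS|R_3(G)) = \inf_{\theta \in \Omega_0} P_\theta(CS|R_3(G))$,

where $\Omega_k = \{\theta \in \Omega \mid \theta_{[k-1]} = \theta_{[k]}\}$ and $\Omega_0 = \{\theta \in \Omega \mid \theta_{[1]} = \cdots = \theta_{[k]}\}$.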

Remark: When $\theta \in \Omega_0$, procedures $R_1(G)$ and $R_3(G)$ are distribution-free
in the sense that the distributions of the statistics
$\max_{1 \le j \le k} H_j - H_i$ and $H_i$ do not depend upon the underlying distribution
$F(\cdot|\theta)$.

In general, the least favorable configuration (LFC) of the rule
$R_1(G)$ is unknown except for k = 2; however, it is known (see Rizvi
and Woodworth (1970)) that the LFC need not occur in $\Omega_0$. In order
to compare rules $R_1(G)$ and $R_3(G)$ for various choices of G, the
constants d and D are chosen to yield approximately the same P* when
$\theta \in \Omega_0$. The ratio EFF(R) = P(CS|R)/E(S|R) is used to compare the
rules, where E(S|R) is the expected size of the selected subset.

Now, taking G to be a symmetric lambda distribution with
location parameter $\alpha$, scale parameter $\beta$ and shape parameter $\gamma$, for
$\theta \in \Omega_0$, we have the following:

(3.3.3) $a(r) = E(T_{(r)}|G)$

(3.3.5) $\sum_{i=1}^{k} H_i = \alpha k$.
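
For readers who want to compute such scores, the sketch below evaluates $a(r) = E(T_{(r)}|G)$ numerically. It rests on an assumption that should be checked against Chapter 1: the symmetric lambda quantile function is taken to be $Q(u) = \alpha + (u^\gamma - (1-u)^\gamma)/\beta$, so that the expected order statistic follows from the moments of uniform order statistics. The function name is hypothetical, and $r + \gamma$ and $N - r + 1 + \gamma$ must be positive.

```python
from math import exp, lgamma, sqrt


def lambda_score(r, n_total, alpha, beta, gam):
    """Expected r-th order statistic in a sample of size n_total from the
    (assumed) symmetric lambda distribution with parameters alpha, beta, gam."""
    def uniform_moment(j):
        # E[U_(j)^gam] = Gamma(N+1) Gamma(j+gam) / (Gamma(j) Gamma(N+1+gam))
        return exp(lgamma(j + gam) - lgamma(j)
                   + lgamma(n_total + 1) - lgamma(n_total + 1 + gam))
    return alpha + (uniform_moment(r) - uniform_moment(n_total - r + 1)) / beta


if __name__ == "__main__":
    n_total = 9  # e.g. k = 3 populations with n = 3 observations each
    scores = [lambda_score(r, n_total, 0.0, 1.0 / sqrt(3.0), 1.0)
              for r in range(1, n_total + 1)]
    print(scores)
```

In the uniform case $(\beta, \gamma) = (0.57735, 1)$ the scores are linear in r, $a(r) = \sqrt{3}\,(2r - N - 1)/(N + 1)$ when $\alpha = 0$, and the scores sum to $N\alpha$, which is consistent with (3.3.5).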

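Now, let $a(r) = \alpha + c_r$. When N = 2m+1, m ≥ 0, we have from
(3.3.3)

$c_{2m+1} = -c_1,\ \ldots,\ c_{m+2} = -c_m,\ c_{m+1} = 0$.

In this case, we obtain

(3.3.6) $E(H_i) = \alpha$,

(3.3.7) $n^2\,\mathrm{Var}(H_i) = \dfrac{2n(k-1)}{k(N-1)} \sum_{j=m+2}^{N} c_j^2$,

(3.3.8) $n^2\,\mathrm{Cov}(H_i, H_j) = \dfrac{-2n}{k(N-1)} \sum_{j=m+2}^{N} c_j^2$,

and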

(3.3.9)  -  ■j—y  <_  Cov(H^  ,Hj )  <  0. 


On the other hand, when N = 2m, m > 0, we get

$c_{2m} = -c_1,\ \ldots,\ c_{m+1} = -c_m$.

Consequently, in this case also we obtain results (3.3.6) through
(3.3.9) except that the summations in (3.3.7) and (3.3.8) will be
from m+1 to N instead of m+2 to N.

Gupta and McDonald (1970) derived the exact distribution of
$\max_{1 \le j \le k} H_j - H_i$ for the case of $a(R_{ij}) = R_{ij}$ for k = 3 and n = 2(1)5.
Also, for $a(R_{ij}) = R_{ij}$, $H_i$ corresponds to the well-known Mann-Whitney U-statistic.
But in general the distribution of $\max_{1 \le j \le k} H_j - H_i$ is not known since
it depends on G. However, with a(r) defined as in (3.3.3), for
k = 3 and d ≥ 0,

$\Pr\{\max_{1 \le j \le 3} H_j - H_1 \le d\} = \Pr\{H_2 - H_1 \le d,\ H_3 - H_1 \le d\}$

can be evaluated on the computer. Without loss of generality, one
can assume that $\alpha = 0$. Table III.1, Table III.2, and Table III.3
provide, respectively, the values of a(r), d-values for the procedure
$R_1(G)$, and D-values for the rule $R_3(G)$, for k = 3,
n = 3, 5, and $(\beta, \gamma)$ = (0.57735, 1.00000), (0.19745, 0.13491),
(-0.0006589, -0.0003630), (-0.16857, -0.080199). In Tables III.2
and III.3, we choose P* = 0.75, 0.90, 0.95, 0.975 and 0.99. The
four choices of $(\beta, \gamma)$ specified above correspond to the cases
where the lambda distribution can be used to approximate uniform,
normal, logistic and double exponential distributions, respectively,
each with mean 0 and variance 1. Accordingly, these choices are
denoted in the tables by U, N, L, and D, respectively.

Finally, we briefly discuss how approximate values of d and D can
be obtained using asymptotic theory.

Theorem 3.3.2. For $\theta \in \Omega_0$ and for the rule $R_1(G)$,

$P(CS|R_1(G)) \approx \int_{-\infty}^{\infty} \Phi^{k-1}\left(x + \dfrac{nd}{v}\right) d\Phi(x)$,

where $v^2 = \mathrm{Var}(H_i) - C_v$, $C_v$ is the common covariance between $H_i$ and $H_j$ for
$i \ne j$, and $\Phi(x)$ is the cdf of a standard normal distribution.

Proof. By checking Lindeberg's condition, one can show that
$nH_i/\sqrt{\mathrm{Var}(H_i) - C_v}$ is asymptotically normally distributed. Hence the
result follows.

The value of d satisfying

$\int_{-\infty}^{\infty} \Phi^{k-1}\left(x + \dfrac{nd}{v}\right) d\Phi(x) = P^*$

can be obtained from the tables of Gupta (1963), Gupta, Nagel and
Panchapakesan (1969) or Gupta, Panchapakesan and Sohn (1985), who have
tabulated $h = nd/(\sqrt{2}\,v)$.
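
When the cited tables are not at hand, the integral equation can also be solved numerically. The sketch below is a hedged illustration rather than the method used for the tables: it solves $\int \Phi^{k-1}(x + c)\,d\Phi(x) = P^*$ for the shift c appearing inside $\Phi^{k-1}$ by a midpoint rule and bisection, and leaves the conversion from c to d (through n and v) to the reader. The function names, the integration limits and the bracket [0, 20] are arbitrary choices, and P* must be at least 1/k.

```python
from math import erf, exp, pi, sqrt


def phi_cdf(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def pcs_integral(c, k, steps=4000, lim=8.0):
    """Midpoint-rule value of the integral of Phi^(k-1)(x + c) against dPhi(x)."""
    h = 2.0 * lim / steps
    total = 0.0
    for i in range(steps):
        x = -lim + (i + 0.5) * h
        total += phi_cdf(x + c) ** (k - 1) * exp(-0.5 * x * x)
    return total * h / sqrt(2.0 * pi)


def solve_shift(p_star, k, lo=0.0, hi=20.0, tol=1e-8):
    """Bisection for the shift c with pcs_integral(c, k) = p_star (p_star >= 1/k)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if pcs_integral(mid, k) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(solve_shift(0.90, 3))
```

For k = 2 the integral reduces to $\Phi(c/\sqrt{2})$, so the solution for P* = 0.90 is $\sqrt{2}$ times the 0.90 normal quantile, about 1.8124, which is a convenient check.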

Similarly, the following theorem holds for the rule $R_3(G)$.


Theorem  3.3.3.  For  e_  $  and  N  =  2m+l , 
P(CS|R3(G))  -  $k(^), 


where $w^2 = \dfrac{2(k-1)}{nk(kn-1)} \sum_{j=m+2}^{kn} c_j^2$.


Proof.  Proof  is  analogous  to  that  of  Theorem  3.3.2  and  hence  omitted. 

-1  1  /k 

From  the  above  theorem,  we  have  D  =  4>  (nwP*  '  ). 


3.3.2. Evaluation of Constants for $R_1(G)$ and $R_3(G)$ using scores $a_0(r)$

In this section, we use a score function $a_0(r)$ (to be defined
later) in the rules $R_1(G)$ and $R_3(G)$ and evaluate the associated
constants d and D.

In order to define the scores $a_0(r)$, consider the density $d(x,\theta)$,
on an interval containing the origin, satisfying the following
regularity conditions.

(i) $d(x,\theta)$ is absolutely continuous in $\theta$ for almost every x;

(ii) the limit

$\dot d(x,0) = \lim_{\theta \to 0} \dfrac{1}{\theta}\,[d(x,\theta) - d(x,0)]$

exists for almost every x;

(iii) $\lim_{\theta \to 0} \int_{-\infty}^{\infty} |\dot d(x,\theta)|\,dx = \int_{-\infty}^{\infty} |\dot d(x,0)|\,dx < \infty$

holds, with $\dot d(x,\theta)$ denoting the partial derivative with respect to $\theta$.
Note that the existence of $\dot d(x,\theta)$ for almost every $\theta$ is insured at
every point x such that $d(x,\theta)$ is absolutely continuous in $\theta$. This,
however, does not make the condition (ii) superfluous.

In deriving locally most powerful tests for equality of location,
Gupta, Huang and Nagel (1979) used the score function $a_0(r)$ defined by

(3.3.10) $a_0(r) = E\left[\dfrac{\dot d(X_N^{(r)},0)}{d(X_N^{(r)},0)}\right]$,

where $X_N^{(r)}$ denotes the r-th order statistic in a sample of size N from
the distribution with density d(x,0). For the location parameter case
$a_0(r)$ can be written as

(3.3.11) $a_0(r) = E\left[-\dfrac{f'(F^{-1}(U^{(r)},0),0)}{f(F^{-1}(U^{(r)},0),0)}\right]$,

where $U^{(r)}$ denotes the r-th order statistic in a sample of size N
from the uniform distribution. Now, specifying $d(x,\theta)$ to be the
symmetric lambda density with parameters $\theta$ (location), $\beta$ (scale) and
$\gamma$ (shape), we obtain

r  1 


aS(r)  -J 


/'  n(n->)  du.  6  >  o. 

o  M  v^u^+d-ur-1)2 

/’  N(n-')  du,  b<o. 

0  r-'  ,!(u’-V|l-»)’:')2 


For k = 3, n = 3, 5, and the selected values of $(\beta, \gamma)$ which were
denoted by U, N, L and D earlier in Section 3.3.1, the values of $a_0(r)$
are tabulated in Table III.4. For the same values of k, n and $(\beta, \gamma)$,
the constants d and D are given in Tables III.5 and III.6, respectively,
with P* = 0.75, 0.90, 0.95, 0.975, 0.99 in each case.

Remark:  Nagel  (1970)  and  Gupta,  Huang  and  Nagel  (1979)  have  derived 

locally  optimal  subset  selection  procedures.  It  follows  from  their 

results  that  the  rule  Rj(G)  is  locally  optimal  in  the  sense  that  the 

rule maximizes the PCS in a neighborhood of any $\theta \in \Omega_0$ among all rules
which satisfy $\inf_{\theta \in \Omega_0} P(CS|R) = P^*$.

3.3.3. Comparisons of the Procedures $R_1(G)$ and $R_3(G)$.

As we have stated in Section 3.3.1, the procedures $R_1(G)$ and
$R_3(G)$ are compared in terms of EFF(R), which is used as a measure of
efficiency. A large value indicates high efficiency.

For a proper comparison of the two procedures, we should have the
constants d and D such that the two procedures will have the PCS
approximately equal to P* for $\theta \in \Omega_0$. In our Monte Carlo studies with
k = 3, this led to the choice of P* = 0.90, 0.95, 0.975 for n = 3, and
P* = 0.75, 0.90, 0.95, 0.975 for n = 5. Further, we considered normal,
logistic, and double exponential distributions, all with variance 1, as
three possible choices of the underlying distributions. Let $\theta_1$, $\theta_2$,
$\theta_3$ be the means of the three populations $\pi_1$, $\pi_2$, $\pi_3$. We considered
four different configurations of $\theta = (\theta_1, \theta_2, \theta_3)$, namely,

I: $\theta$ = (0, 0, 0.1), II: $\theta$ = (0, 0, 0.5),

III: $\theta$ = (0, 0, 1), IV: $\theta$ = (0, 0.5, 1.0).

For comparisons using the score function a(r), we chose the four
choices of the parameter $(\beta, \gamma)$ of the lambda distribution, referred to
by U, N, L, and D in Section 3.3.1. For comparisons using $a_0(r)$, the
choice of $(\beta, \gamma)$, denoted by UD, is made so that the lambda distribution
can be used to approximate the underlying distributions with variance 1.

For each choice of the underlying distribution, random samples
were generated by using the random number generator RVP, developed
by Professor Rubin at Purdue University. Our results are based
on 1000 simulations in the case of n = 3 and 500 simulations in
the case of n = 5. Table III.7 is reproduced for the cases where the
underlying distributions are normal and logistic distributions with the
mean configuration II for (n, P*) = (3, 0.90); the patterns in the other
cases are similar.
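
A simulation of this kind can be sketched as follows. This is not the code used in the study: it substitutes Python's standard generator for RVP, uses rank-sum scores a(r) = r with normal populations of variance 1, and takes the constant d as a plain input instead of a tabulated value. All names are hypothetical.

```python
import random


def simulate_eff(theta, n, d, reps, seed=0):
    """Monte Carlo estimates of (PCS, E(S), EFF) for a rule of type R_1 with
    scores a(r) = r, where EFF = PCS / E(S)."""
    rng = random.Random(seed)
    k = len(theta)
    best = max(range(k), key=lambda i: theta[i])
    correct, size = 0, 0
    for _rep in range(reps):
        pooled = sorted((rng.gauss(theta[i], 1.0), i)
                        for i in range(k) for _obs in range(n))
        h = [0.0] * k
        for rank, (_, i) in enumerate(pooled, start=1):
            h[i] += rank / n
        top = max(h)
        chosen = [i for i in range(k) if h[i] >= top - d]
        correct += best in chosen
        size += len(chosen)
    # the selected subset is never empty for d >= 0, so size > 0
    return correct / reps, size / reps, correct / size


if __name__ == "__main__":
    # configuration II with n = 3 and an arbitrary illustrative d
    print(simulate_eff([0.0, 0.0, 0.5], n=3, d=2.0, reps=1000))
```

Because the expected subset size is never smaller than the probability of including the best population, the estimated EFF always lies in (0, 1]; a very large d selects every population and gives EFF = 1/k.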

Besides  comparing  the  efficiencies  of  the  rules  R-j  (G)  and  R_(G) 
under  each  choice  of  G,  we  are  also  interested  in  comparing  the 
different  choices  of  G  for  each  rule.  Based  on  the  Monte  Carlo  study, 
our  conclusions  are  summarized  below. 

(1) When the means are close to each other, no rule performs
uniformly better than the other when the underlying distributions are
normal or double exponential; however, as $P^* \to 1$, the rule R3(G) performs
slightly better than the rule R1(G). With means close to each other,
the situation changes when the underlying distributions are uniform or
logistic: then, the rule R3(G) performs almost uniformly better than
the rule R1(G).

(2) When the largest mean is sufficiently far from the next
largest, the rule R1(G) generally performs better than the rule R3(G)
no matter what the choice of G is. This behavior becomes clearer as
n increases. Also, when P* is close to 1, the difference in the
performances of the two rules narrows, even though R1(G) is still better.

(3) Generally, the rule R1(G) performs better than the rule R3(G)
when G is chosen to be the lambda distribution approximating either the
uniform distribution or the underlying distribution F (i.e., G is U or
UD), both with variance 1.

(4) Considering the efficiency of the procedure R1(G), the best
choice of G is the lambda distribution which approximates the uniform
distribution with unit variance (i.e., G is U).

(5) For the rule R3(G), the best choice of G is the lambda
distribution approximating the underlying distribution with unit variance.
This  is  all  the  more  clear  when  the  underlying  distributions  are  normal 
or  double  exponential  with  their  means  close  to  each  other. 


Considering all the findings of the study, the overall
recommendations are:

(1) When the means of the underlying distributions are expected
to be close to each other, use either the rule R1(G) with U as the
choice for G or the rule R3(G) with UD as the choice for G.

(2) When the largest mean is expected to be sufficiently far
from the next largest, use the rule R1(G) with U as the choice for G.


AD-A159 193  MULTIPLE DECISION PROCEDURES FOR TUKEY'S GENERALIZED
LAMBDA DISTRIBUTIONS (U)  PURDUE UNIV LAFAYETTE IN DEPT
OF STATISTICS  J K SOHN  AUG 89  TR-89-28
UNCLASSIFIED  N88814-84-C-8167  F/G 12/1




Table III.1

Values of a(r) under $\Omega_0$ for k = 3,
where $\Omega_0 = \{\theta \in \Omega \mid \theta_1 = \theta_2 = \theta_3\}$

n    a(r)      U          N          L          D

3    a(9)      1.38552    1.48669    1.49804    1.49582
     a(8)      1.03914    0.93118    0.87778    0.83529
     a(7)      0.69276    0.57013    0.52348    0.48933
     a(6)      0.34638    0.27334    0.24800    0.22992
     a(5)      0.         0.         0.         0.

5    a(15)     1.51541    1.73896    1.79233    1.81764
     a(14)     1.29893    1.24834    1.20149    1.15927
     a(13)     1.08240    0.94605    0.88346    0.83506
     a(12)     0.86595    0.71257    0.65382    0.61080
     a(11)     0.64936    0.51350    0.46595    0.43213
     a(10)     0.43298    0.33363    0.30065    0.27756
     a(9)      0.21649    0.16441    0.14759    0.13591
     a(8)      0.         0.         0.         0.

Note: For n = 3, a(i) = -a(10-i), i = 1,...,4, and for n = 5,
a(i) = -a(16-i), i = 1,...,7.
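
As a reading aid for Table III.1, the U column agrees to about three decimals with the expected order statistics of a uniform distribution with mean 0 and variance 1, that is, with sqrt(3) (2r/(kn+1) - 1). The sketch below checks this and the antisymmetry stated in the Note. Treating a(r) as an expected order statistic is an assumption made here for illustration, since the definition of a(r) lies outside this excerpt; the function name `uniform_score` is hypothetical.

```python
import math


def uniform_score(r, total):
    """Expected r-th order statistic among `total` draws from a uniform
    distribution with mean 0 and variance 1 (an assumed closed form)."""
    return math.sqrt(3.0) * (2.0 * r / (total + 1) - 1.0)


# k = 3 populations with n = 3 observations each give 9 pooled ranks.
scores_n3 = [uniform_score(r, 9) for r in range(1, 10)]

# k = 3 populations with n = 5 observations each give 15 pooled ranks.
scores_n5 = [uniform_score(r, 15) for r in range(1, 16)]
```

The small remaining differences from the tabulated values are consistent with the table being computed from a lambda distribution that only approximates the uniform distribution.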
Table III.2

d-values of the procedure R1(G) under
$\Omega_0 = \{\theta \in \Omega \mid \theta_1 = \theta_2 = \theta_3\}$ for k = 3

               U                N                L                D
P*        n=3     n=5      n=3     n=5      n=3     n=5      n=3     n=5

0.75     2.423   3.156    2.431   3.173    2.402   3.135    2.388   3.094
0.90     3.809   4.887    3.644   4.825    3.597   4.750    3.538   4.684
0.95     4.501   5.843    4.264   5.744    4.227   5.648    4.114   5.556
0.975    4.848   6.619    4.747   6.490    4.644   6.370    4.545   6.249
0.99     5.194   7.485    5.131   7.288    5.026   7.124    4.920   6.984

Table III.4

Values of a*(r) for some values
of (b,y) and n = 3, 5

n    a*(r)       N            L              D

3    a*(9)      10.95367     3997.81042     18.30010
     a*(8)       6.96000     2999.62692     14.97188
     a*(7)       4.31341     2000.12774     10.39644
     a*(6)       2.08126     1000.15792      5.30546
     a*(5)       0.0            0.0          0.0

5    a*(15)     12.76184     4371.83812     19.28142
     a*(14)      9.24459     3748.92094     18.05537
     a*(13)      7.09456     3124.76751     15.75637
     a*(12)      5.39158     2500.15400     12.98432
     a*(11)      3.90891     1875.28911      9.93465
     a*(10)      2.54966     1250.26286      6.71000
     a*(9)       1.25921      625.15168      3.38000
     a*(8)       0.0            0.0          0.0

Note: For n = 3, a*(1) = -a*(9), ..., a*(4) = -a*(6). Also for
n = 5, a*(1) = -a*(15), ..., a*(7) = -a*(9).



Table  III. 5 


Table III.6

Values of D of the rule
R3(G) with a*(r)

 -6.96000     -2997.84061    -13.20912
 13.18583     -4997.96834    -23.60556
 15.83242     -5999.91257    -30.67377
 17.91368     -6998.09608    -34.00199
 22.22709     -8997.56508    -43.66842

 -9.20044     -3746.81187    -16.98243
 16.89421     -6870.99386    -32.07069
 21.33906     -8125.32180    -40.66679
 25.02452     -9996.04816    -47.27144
 29.05684    -11249.13155    -56.14283

Table III.7


EFF(MG))  0.374  0.378  0.378  0.378  0.382  0.348  0.353  0.353

(0.003)  (0.003)  (0.003)  (0.003)  (0.003)  (0.003)  (0.003)  (0.003)


In Section 4.2 we give notations and definitions including the
definition of the $100(1-2\alpha)\%$ HPD credible region.

In Section 4.3 we propose a procedure $R(\alpha,d)$ which selects the
best after retaining a subset of populations at stage 1 and
investigate its properties.

4.2.  Framework 

Let $\pi_1, \ldots, \pi_k$ be independent normal populations with
unknown means $\theta_1, \ldots, \theta_k$, respectively, and unknown common variance
$\sigma^2$ ($0 < \sigma^2 < \infty$). Also let a random variable $X_i$ be the observable
characteristic associated with $\pi_i$. For i = 1, 2, ..., k, let
$X_i = (X_{i1}, \ldots, X_{in})$ be a vector of n independent observations from
$\pi_i$, i = 1, 2, ..., k, respectively. Assuming that very little is
known to the experimenter about the prior distribution of
$(\theta_1, \ldots, \theta_k, \sigma^2)$, we may use a locally uniform joint prior density
$\tau(\theta_1, \theta_2, \ldots, \theta_k, \sigma^2) = \sigma^{-2} I_{(0,\infty)}(\sigma^2)$, which is also a noninformative
prior for the model, where $I_A(x)$ is the usual indicator function.
Let $\tau_1(\theta_1, \ldots, \theta_k \mid X_1, \ldots, X_k)$ be the marginal joint posterior
distribution of $\theta' = (\theta_1, \ldots, \theta_k)$ given $X' = (X_1, \ldots, X_k)$.

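$\pi_i$ is said to be 'good' ('bad') if $\theta_i > \theta_0$ ($\theta_i \le \theta_0$),
where $\theta_0$ is a control or standard which is specified a priori by
the experimenter. Let $\delta^{(1)}(X) = (\delta_1^{(1)}(X_1), \ldots, \delta_k^{(1)}(X_k))$, where
$\delta_i^{(1)}(X_i)$ is a nonrandomized decision rule for $\pi_i$ at stage 1, i.e.,
$\delta_i^{(1)}(X_i) = 1$ if $\pi_i$ is accepted as a good population and
$\delta_i^{(1)}(X_i) = 0$ if $\pi_i$ is rejected as a bad one. Let the loss function
$L^{(1)}(\theta, \delta^{(1)}(X))$ at stage 1 be as follows:

(4.2.1)  $L^{(1)}(\theta, \delta^{(1)}(X)) = \sum_{i=1}^{k} L_i^{(1)}(\theta_i, \delta_i^{(1)}(X_i))$,

where $L_i^{(1)}(\theta_i, \delta_i^{(1)}(X_i))$ is the loss due to the decision $\delta_i^{(1)}(X_i)$ about
$\pi_i$ such that

(4.2.2)  $L_i^{(1)}(\theta_i, \delta_i^{(1)}(X_i)) = \begin{cases} k_0 & \text{if } \delta_i^{(1)}(X_i) = 1 \text{ and } \theta_i \le \theta_0 \\ k_1 & \text{if } \delta_i^{(1)}(X_i) = 0 \text{ and } \theta_i > \theta_0 \\ 0 & \text{otherwise,} \end{cases}$

in other words, the loss due to selecting each bad population is $k_0$
and the loss due to rejecting each good population is $k_1$.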
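
The additive stage-1 loss of (4.2.1) and (4.2.2) can be written out directly. The following is a minimal sketch; the function name `stage1_loss` and the example decisions are hypothetical and chosen only to exercise each branch of the loss.

```python
def stage1_loss(thetas, decisions, theta0, k0, k1):
    """Total stage-1 loss: k0 for each bad population that is accepted,
    k1 for each good population that is rejected, 0 otherwise."""
    total = 0.0
    for theta, accepted in zip(thetas, decisions):
        if accepted == 1 and theta <= theta0:
            total += k0  # a bad population was selected
        elif accepted == 0 and theta > theta0:
            total += k1  # a good population was rejected
    return total


# Five populations with control theta0 = 0; decisions are 1 (accept) or 0 (reject).
loss = stage1_loss((-0.2, 0.0, 0.0, 0.2, 0.4), (0, 1, 0, 1, 0), 0.0, 1.0, 2.0)
```

In this example one bad population (mean 0) is accepted and one good population (mean 0.4) is rejected, so the loss is k0 + k1.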

Remarks:

One might question the suitability of a loss of this kind in
this problem. However, such a loss function can be appropriate
for two-component decision problems, because it can reflect the
relative importance of the two types of possible
misclassification errors. For our situation, at stage 1, we 'only'
want to classify populations into possibly good and bad populations.
Thus at stage 1 our problem can be regarded as k two-component
decision problems. Problems of this type have been investigated by
Lehmann (1957).

Let our final nonrandomized decision $\delta^{(2)}(Y)$ at stage 2 be
$\delta^{(2)}(Y) = \{j : j \in S\}$, where $Y' = (Y_1, \ldots, Y_s)$ are combined samples
from stage 1 and stage 2 for populations in S, and S is the subset
selected at stage 1, with size s. Let the loss due to the decision
$\delta^{(2)}(Y)$ be

(4.2.3)  $L^{(2)}(\theta, \delta^{(2)}(Y)) = I\{\theta_j \ne \theta_{[k]}\}$.

Now we give the definition of the $100(1-2\alpha)\%$ HPD credible
region which we will use at stage 2.

Let $\tau_1(\theta \mid X)$ be the marginal posterior density of $\theta$ given X.

Definition 4.2.1 (see Berger (1980)). The $100(1-2\alpha)\%$ HPD credible
region for $\theta$ is the subset $C_{(1-2\alpha)}$ of the parameter space $\Theta$ of the
form

(4.2.4)  $C_{(1-2\alpha)} = \{\theta \in \Theta : \tau_1(\theta \mid X = x) \ge c_{2\alpha}\}$,
where $c_{2\alpha}$ is the largest constant such that

(4.2.5)  $\Pr(C_{(1-2\alpha)} \mid X = x) \ge 1 - 2\alpha$.

Remark:

If $\tau_1(\theta \mid X)$ is not unimodal, then the credible region $C_{(1-2\alpha)}$
may consist of several disjoint intervals.
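
The definition and the remark can be illustrated numerically. The sketch below approximates an HPD region on an equally spaced grid by admitting grid points in order of decreasing density until the accumulated posterior mass reaches the required level. The grid, the two example densities and the function names are assumptions made for illustration; they are not the posterior densities of this chapter.

```python
import math


def hpd_region(grid, density, level):
    """Grid points of the highest-density region with mass >= level."""
    step = grid[1] - grid[0]
    total = sum(density) * step  # numerical normalising constant
    order = sorted(range(len(grid)), key=lambda i: density[i], reverse=True)
    mass, chosen = 0.0, []
    for i in order:  # highest density first
        chosen.append(i)
        mass += density[i] * step / total
        if mass >= level:
            break
    return sorted(grid[i] for i in chosen)


def as_intervals(points, step):
    """Group sorted grid points into maximal runs of adjacent points."""
    runs = [[points[0], points[0]]]
    for p in points[1:]:
        if p - runs[-1][1] > 1.5 * step:
            runs.append([p, p])  # a gap: start a new interval
        else:
            runs[-1][1] = p
    return runs


def normal_pdf(x, mean):
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


STEP = 0.01
GRID = [-8.0 + STEP * i for i in range(1601)]

unimodal = [normal_pdf(x, 0.0) for x in GRID]
bimodal = [0.5 * normal_pdf(x, -3.0) + 0.5 * normal_pdf(x, 3.0) for x in GRID]

one_piece = as_intervals(hpd_region(GRID, unimodal, 0.95), STEP)
two_pieces = as_intervals(hpd_region(GRID, bimodal, 0.90), STEP)
```

For the unimodal density the region is a single interval close to (-1.96, 1.96); for the bimodal mixture it splits into two disjoint intervals, one around each mode, as the remark describes.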

4.3.  Goal and a Proposed Procedure $R(\alpha,d)$.

Assume that no knowledge is available concerning the correct
pairing between populations and the ordered $\theta_i$'s. Our goal is to
select the population associated with the largest unknown mean, if
any, from the set of good populations. The procedure $R(\alpha,d)$ is
designed to meet the goal.

4.3.1.  Definition of the Procedure $R(\alpha,d)$.

Stage 1. Take $n_0 = \max\{2, [\ldots] + 1\}$ observations from
each population $\pi_i$, where $z_{(1-\alpha)}$ is the $100(1-\alpha)$ percentile of the
standard normal distribution and [a] is the largest integer $\le a$.

Note that 2d corresponds to the width of the $100(1-2\alpha)\%$ HPD credible
region for $\theta$, which is to be specified by the experimenter.

Now  based  on  first  stage  samples,  we  select  a  subset  S  by  the 


in S till $N - n_0$ observations are taken such that

(4.3.1)  $N = \inf\{n : n \ge \max\{n_0, [t_\alpha^2 V_n^2 / (d^2 q)] + 1\}\}$,

where $t_\alpha$ is the $100\alpha$ lower percentile of the Student's t distribution
with $q = (k-s)(n_0-1) + s(n-1)$ degrees of freedom and

$V_n^2 = \sum_{i \notin S} \sum_{j=1}^{n_0} (X_{ij} - \bar{X}_i)^2 + \sum_{i \in S} \sum_{j=1}^{n} (Y_{ij} - \bar{Y}_i)^2$, and $\bar{Y}_i = \sum_{j=1}^{n} Y_{ij} / n$.

Then our final decision at stage 2 is

$\delta^{(2)}(Y) = \{j : j \in S \text{ and } \bar{Y}_j = \max_{1 \le i \le s} \bar{Y}_i\}$,

that is, to select the population associated with the largest
overall sample mean and claim it to be the best population among
good populations.
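
The second-stage sampling and the final decision can be sketched as below. This is an illustration under stated assumptions, not the report's program: the subset S is taken as given (the stage-1 selection rule is not reproduced here), `t_value` is a caller-supplied function returning the percentile $t_\alpha$ for q degrees of freedom (for example from a statistics library), and `draw`, `second_stage` and `max_n` are hypothetical names.

```python
import math


def sum_sq(xs):
    """Sum of squared deviations from the sample mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)


def second_stage(rejected, selected, draw, d, t_value, max_n=100000):
    """Sample the selected populations until rule (4.3.1) is met.

    rejected: stage-1 samples (size n0 each) of populations not in S
    selected: stage-1 samples (size n0 each) of populations in S,
              extended in place as new observations are drawn
    draw(i):  returns one new observation from the i-th selected population
    Returns (N, index in `selected` of the largest overall sample mean).
    """
    n0 = len(selected[0])
    n = n0
    while n <= max_n:
        q = len(rejected) * (n0 - 1) + len(selected) * (n - 1)
        v2 = sum(sum_sq(x) for x in rejected) + sum(sum_sq(y) for y in selected)
        if n >= math.floor(t_value(q) ** 2 * v2 / (d ** 2 * q)) + 1:
            best = max(range(len(selected)), key=lambda i: sum(selected[i]) / n)
            return n, best
        for i, y in enumerate(selected):
            y.append(draw(i))  # one more observation from each population in S
        n += 1
    raise RuntimeError("stopping rule not met within max_n observations")


# Tiny deterministic example: one rejected and one selected population.
result = second_stage(
    rejected=[[0.0, 2.0]],
    selected=[[1.0, 3.0]],
    draw=lambda i: 2.0,
    d=1.0,
    t_value=lambda q: 1.0,  # placeholder percentile, for illustration only
)
```

In the example the rule is not met at n = 2 (the bound is 3), one further observation is drawn, and sampling stops at N = 3.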


4.3.2.  Properties of the Procedure $R(\alpha,d)$.

It is easy to verify that the marginal joint posterior
density $\tau_1(\theta_1, \ldots, \theta_k \mid X_1, \ldots, X_k)$ at stage 1 follows a multivariate t
distribution with variance-covariance matrix W = V I, where I is a
k x k identity matrix. Hence the marginal posterior density of $\theta_i$
given $X_1, \ldots, X_k$ at stage 1 follows a Student's t distribution with
$k(n_0-1)$ degrees of freedom, a location parameter $\bar{X}_i$ and a scale
parameter V. Similarly, at stage 2 the marginal posterior density
of $\theta_i$ of $\pi_i$ in S given $\{X_i, i \notin S\}$ and Y follows a Student's t
distribution with $q = (k-s)(n_0-1) + s(N-1)$ degrees of freedom, a
location parameter $\bar{Y}_i$ and a scale parameter Q, where

(4.3.2)  $Q^2 = \frac{1}{qN} \left[ \sum_{i \notin S} \sum_{j=1}^{n_0} (X_{ij} - \bar{X}_i)^2 + \sum_{i \in S} \sum_{j=1}^{N} (Y_{ij} - \bar{Y}_i)^2 \right]$.


Hence  the  following  theorem  holds. 


Theorem 4.3.1. The stopping rule N provides the $100(1-2\alpha)\%$ HPD
credible region with a common width 2d for each population selected
at stage 1.

Proof. The proof is straightforward and hence omitted.

Remark:

Since the loss $L^{(1)}(\theta, \delta^{(1)}(X))$ at stage 1 is linear and
additive, the decision rule $\delta^{(1)}(X)$ is Bayes. This follows from the
fact that $E[L_i^{(1)}(\theta_i, \{1\})] = k_0 \Pr\{\theta_i \le \theta_0 \mid X\}$ and
$E[L_i^{(1)}(\theta_i, \{0\})] = k_1 \Pr\{\theta_i > \theta_0 \mid X\}$, for i = 1, ..., k, respectively.
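
The two posterior expected losses in this remark suggest the generic Bayes comparison sketched below: accept a population when the expected loss of accepting does not exceed that of rejecting. This is an illustration of the general principle and is not claimed to be the exact stage-1 rule of the procedure; the posterior probability is taken as an input, and the function name and the tie-breaking convention (accept on equality) are assumptions.

```python
def stage1_bayes_decision(p_good, k0, k1):
    """p_good is the posterior probability Pr{theta_i > theta_0 | X}.
    Return 1 (accept as good) or 0 (reject as bad)."""
    risk_accept = k0 * (1.0 - p_good)  # expected loss of deciding "good"
    risk_reject = k1 * p_good          # expected loss of deciding "bad"
    return 1 if risk_accept <= risk_reject else 0


# The same posterior evidence leads to different decisions as k1/k0 grows.
decisions = [stage1_bayes_decision(0.3, 1.0, k1) for k1 in (1.0, 2.0, 5.0)]
```

Equivalently, a population is accepted when p_good is at least k0/(k0 + k1), so a larger ratio k1/k0 lowers the evidence needed to retain a population.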


Theorem 4.3.2. Let $n^* = \sigma^2 z_{(1-\alpha)}^2 / d^2$. Then for a fixed $\sigma^2$ ($0 < \sigma^2 < \infty$)
and the stopping rule N,

(a) $N/n^* \to 1$ a.s. as $d \to 0$,

(b) $\lim_{d \to 0} E(N/n^*) = 1$ (asymptotic efficiency).
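
For a sense of scale, the benchmark sample size $\sigma^2 z_{(1-\alpha)}^2 / d^2$ of Theorem 4.3.2 can be evaluated numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `n_star` and the chosen inputs are illustrative assumptions.

```python
from statistics import NormalDist


def n_star(sigma2, alpha, d):
    """Benchmark sample size sigma^2 * z^2 / d^2, with z the 100(1 - alpha)
    percentile of the standard normal distribution."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    return sigma2 * z * z / (d * d)


value = n_star(1.0, 0.025, 0.4)      # roughly 24 observations per population
value_half = n_star(1.0, 0.025, 0.2)  # halving d multiplies the benchmark by 4
```

With unit variance and alpha = 0.025, a half-width of d = 0.4 gives a benchmark of about 24, and d = 0.2 gives about 96.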


Proof. From the definitions of $n_0$ and N, one can get the following
inequalities:


(4.3.3) 


t  2V? 
a  1 


t2V2  Z« 


Vi  <  N  <  -vL  +  -liial  +  4. 
d^q  dZq  0 


Since $n_0 \to \infty$ and $N \to \infty$ as $d \to 0$, $S^2 \to \sigma^2$ a.s. Thus (a) and
(b) follow.


To examine the performance of the procedure $R(\alpha,d)$, a Monte
Carlo study was carried out for k = 5, $\alpha$ = 0.025, 0.05 with 300
simulations. To generate normal random variates with common
variance 1, the random number generator RVP developed by Professor
Rubin was used. As underlying configurations of means (supposed
to be unknown to the experimenter), we chose four different
configurations with d = 0.4, namely,

(I) $\theta$ = (-0.2, 0, 0, 0.2, 0.4)  (II) $\theta$ = (-0.2, -0.2, 0, 0.2, 0.4)

(III) $\theta$ = (-0.2, -0.2, 0, 0, 0.2)  (IV) $\theta$ = (-0.2, -0.2, -0.2, 0, 0.2).

The value of $\theta_0$ was taken to be 0. As a special case under
configuration (IV), d = 0.2 was also chosen and is called
configuration V. Four statistics were simulated: (a) the expected
subset size S at stage 1 (E(S)), (b) the expected value of the overall
sample size N (E(N)), (c) the expected loss at stage 1 (E(L1)) and (d)
the probability of selecting the population associated with the
largest mean (PSB). For the loss function, $(k_0, k_1)$ = (1,1), (1,2), (2,1),
(1,5) and (5,1) were considered. The results are shown in four
figures, where each figure contains the five configurations for
$\alpha$ = 0.025. In each of the four figures, the abscissa is the ratio
$k_1/k_0$. Thus Figure 1 is E(S) versus $k_1/k_0$; Figure 2 is E(N);
Figure 3 is PSB; and Figure 4 is E(L1). Figures for $\alpha$ = 0.05 are
similar to those drawn for $\alpha$ = 0.025 and hence are omitted.

The results indicate:

(1) As $k_1/k_0$ increases, the value of PSB increases.

(2) In general, the value of E(N) increases as $k_1/k_0$ increases.

(3) Values of $k_1/k_0$ are irrelevant to the values of E(L1).

(4) When the number of good populations among the five populations
decreases, the value of E(S) decreases but the value of E(L1)
increases slightly.

(5) When the value of d decreases, the value of PSB increases.
But when the overall sample size required and the value of E(S) are
taken into consideration, the rule $R(\alpha,d)$ does not provide vast
improvement on PSB. This is mainly due to the fact that an
elimination-type procedure cannot recover the best population at
stage 2 if it has been eliminated at stage 1.

(6) For fixed values of the ratio $k_1/k_0$, as the distance between
the largest mean and the smallest mean increases, the values of PSB
increase and the values of E(L1) decrease (slightly).


Figure 1.  E[S] versus the ratio $k_1/k_0$
for five configurations.

Legend of Configurations

(I) $\theta$ = (-0.2, 0, 0, 0.2, 0.4) with d = 0.4

(II) $\theta$ = (-0.2, -0.2, 0, 0.2, 0.4) with d = 0.4

(III) $\theta$ = (-0.2, -0.2, 0, 0, 0.2) with d = 0.4

(IV) $\theta$ = (-0.2, -0.2, -0.2, 0, 0.2) with d = 0.4

(V) $\theta$ = (-0.2, -0.2, -0.2, 0, 0.2) with d = 0.2


Figure 2.  E[N] versus the ratio $k_1/k_0$
for five configurations.

Legend of Configurations

(I) $\theta$ = (-0.2, 0, 0, 0.2, 0.4) with d = 0.4

(II) $\theta$ = (-0.2, -0.2, 0, 0.2, 0.4) with d = 0.4

(III) $\theta$ = (-0.2, -0.2, 0, 0, 0.2) with d = 0.4

(IV) $\theta$ = (-0.2, -0.2, -0.2, 0, 0.2) with d = 0.4

(V) $\theta$ = (-0.2, -0.2, -0.2, 0, 0.2) with d = 0.2


Figure 4.  E[L1] versus the ratio $k_1/k_0$
for five configurations.

Legend of Configurations

(I) $\theta$ = (-0.2, 0, 0, 0.2, 0.4) with d = 0.4
(II) $\theta$ = (-0.2, -0.2, 0, 0.2, 0.4) with d = 0.4
(III) $\theta$ = (-0.2, -0.2, 0, 0, 0.2) with d = 0.4
(IV) $\theta$ = (-0.2, -0.2, -0.2, 0, 0.2) with d = 0.4
(V) $\theta$ = (-0.2, -0.2, -0.2, 0, 0.2) with d = 0.2